Learning Clojure

I completed the first 50 problems at 4clojure.com.

The site helps you learn Clojure in the common way by presenting you with a long set of tasks.

As well as setting the problems it tests your answers live in the browser.

You can follow other users. Once you complete a problem you can see their solutions.

Sometimes that leads you from a solution that looks like this:

fn [coll]
  ((fn dist [prev coll]
     (lazy-seq
      (when-let [[x & xs] (seq coll)]
        (let [more (dist (conj prev x) xs)]
          (if (contains? prev x)
            more
            (cons x more))))))
   #{} coll))

to one that looks like this:

reduce #(if (some #{%2} %) % (conj % %2)) []

Good stuff.

Practical URL Validation

There’s a lot of code out there that deals with URL validation. Basically all of it is concerned with “does this URL meet the RFC spec?” That’s actually not that interesting a question if you are validating user input. You don’t want ‘gopher://whatever/’ or ‘http://10.1.2.3’ as valid URLs in your system if you have asked the user for the address of a web page.

What’s more, because of punycode-powered internationalized URLs most URL validating code will tell you that real URLs that people can use today are invalid (PHP’s parse_url is not even utf8 safe).

Here’s some PHP code that validates URLs in a more practical way. It uses the list of TLDs in static::$validTlds from the IANA list of valid TLDs and assumes the presence of a utf8-safe $this->parseUrl such as Joomla’s version.

    (c) 2013 Thomas David Baker, MIT License

    /**
     * Return true if url is valid, false otherwise.
     *
     * Note that this is not the RFC definiton of a valid URL.  For example we
     * differ from the RFC in only accepting http and https URLs, not accepting
     * single word hosts, and accepting any characters in hostnames (as modern
     * browsers will punycode translate them to ASCII automatically).
     *
     * @param string $url Url to validate.  Must include 'scheme://' to have any
     *                    chance of validating.
     *
     * @return boolean
     */
    public function validUrl($url) {
        $parts = $this->parseUrl($url);

        // We must be able to recognize this as some form of URL.
        if (!$parts) {
            return false;
        }

        // SCHEME.
        // Must be qualified with a scheme.
        if (!isset($parts['scheme']) || !$parts['scheme']) {
            return false;
        }
        // Only http and https are acceptable.  No ftp or similar.
        if (!in_array($parts['scheme'], ['http', 'https'])) {
            return false;
        }

        // CHECK FOR 'EXTRA PARTS'.
        // If a URL has unrecognized bits then it is not valid - for example the
        // 'z' in 'www.google.com:80z'.
        // This check invalidates URLs that use a user - we don't allow those.
        $partsCheck = $parts;
        $partsCheck['scheme'] .= '://';
        if (isset($partsCheck['port'])) {
            $partsCheck['port'] = ':' . $partsCheck['port'];
        }
        if (isset($partsCheck['query'])) {
            $partsCheck['query'] = '?' . $partsCheck['query'];
        }
        if (isset($partsCheck['fragment'])) {
            $partsCheck['fragment'] = '#' . $partsCheck['fragment'];
        }
        if (implode('', $partsCheck) !== $url) {
            return false;
        }

        // HOST.
        if (!isset($parts['host']) || !$parts['host']) {
            return false;
        }
        // Single word hosts are not acceptable.
        if (strpos($parts['host'], '.') === false) {
            return false;
        }
        if (strpos($parts['host'], ' ') !== false) {
            return false;
        }
        if (strpos($parts['host'], '--') !== false) {
            return false;
        }
        if (strpos($parts['host'], '-') === 0) {
            return false;
        }
        // Cope with internationalized domain names.
        $host = idn_to_ascii($parts['host']);

        $hostSegments = explode('.', $host);
        // The IANA lists TLDs in uppercase, so we do too.
        $tld = mb_strtoupper(array_pop($hostSegments));
        if (!$tld) {
            return false;
        }
        if (!in_array(mb_strtoupper($tld), static::$validTlds)) {
            return false;
        }
        $domain = array_pop($hostSegments);
        if (!$domain) {
            return false;
        }

        // PATH.
        if (isset($parts['path']) && substr($parts['path'], 0, 1) !== '/') {
            return false;
        }

        // If you made it this far you're golden.
        return true;
    }

Looking at the list of interesting URLs from http://mathiasbynens.be/demo/url-regex and elsewhere it allows all of the following:

http://foo.com/blah_blah
http://foo.com/blah_blah/
http://foo.com/blah_blah_(wikipedia)
http://foo.com/blah_blah_(wikipedia)_(again)
http://www.example.com/wpstyle/?p=364
https://www.example.com/foo/?bar=baz&inga=42&quux
http://✪df.ws/123
http://➡.ws/䨹
http://⌘.ws
http://⌘.ws/
http://foo.com/blah_(wikipedia)#cite-1
http://foo.com/blah_(wikipedia)_blah#cite-1
http://foo.com/unicode_(✪)_in_parens
http://foo.com/(something)?after=parens
http://☺.damowmow.com/
http://code.google.com/events/#&product=browser
http://j.mp
http://foo.com/?q=Test%20URL-encoded%20stuff
http://مثال.إختبار
http://例子.测试
http://उदाहरण.परीक्षा
http://1337.net
http://a.b-c.de

And disallows all of these:

# Invalid URLs
http://
http://.
http://..
http://../
http://?
http://??
http://??/
http://#
http://##
http://##/
http://foo.bar?q=Spaces should be encoded
//
//a
///a
///
http:///a
foo.com
rdar://1234
h://test
http:// shouldfail.com
:// should fail
http://foo.bar/foo(bar)baz quux
ftps://foo.bar/
http://-error-.invalid/
http://a.b--c.de/
http://-a.b.co
http://0.0.0.0
http://10.1.1.0
http://10.1.1.255
http://224.1.1.1
http://1.1.1.1.1
http://123.123.123
http://3628126748
http://.www.foo.bar/
http://www.foo.bar./
http://.www.foo.bar./
http://10.1.1.1
http://10.1.1.254
# The following URLs are valid by the letter of the law but we don't want to allow them.
http://userid:password@example.com:8080
http://userid:password@example.com:8080/
http://userid@example.com
http://userid@example.com/
http://userid@example.com:8080
http://userid@example.com:8080/
http://userid:password@example.com
http://userid:password@example.com/
http://-.~_!$&\'()*+,;=:%40:80%2f::::::@example.com
http://142.42.1.1/
http://142.42.1.1:8080/
http://223.255.255.254
ftp://foo.bar/baz

parse_url Is Not UTF-8 Safe

Handily the good folks at Joomla have written a UTF-8 safe version:

        /**
	 * Does a UTF-8 safe version of PHP parse_url function
	 *
	 * @param   string  $url  URL to parse
	 *
	 * @return  mixed  Associative array or false if badly formed URL.
	 *
	 * @see     http://us3.php.net/manual/en/function.parse-url.php
	 * @since   11.1
	 */
	public static function parse_url($url)
	{
		$result = false;

		// Build arrays of values we need to decode before parsing
		$entities = array('%21', '%2A', '%27', '%28', '%29', '%3B', '%3A', '%40', '%26', '%3D', '%24', '%2C', '%2F', '%3F', '%23', '%5B', '%5D');
		$replacements = array('!', '*', "'", "(", ")", ";", ":", "@", "&", "=", "$", ",", "/", "?", "#", "[", "]");

		// Create encoded URL with special URL characters decoded so it can be parsed
		// All other characters will be encoded
		$encodedURL = str_replace($entities, $replacements, urlencode($url));

		// Parse the encoded URL
		$encodedParts = parse_url($encodedURL);

		// Now, decode each value of the resulting array
		if ($encodedParts)
		{
			foreach ($encodedParts as $key => $value)
			{
				$result[$key] = urldecode(str_replace($replacements, $entities, $value));
			}
		}
		return $result;
	}

Although non-ASCII characters are not legal in URLs if you want to parse possibly wonky data or internationalized (例子.测试) and other non-ASCII URLs (✪df.ws) that translate to ASCII via Punycode then this is very handy.

Keeping Control of Your Z-Indexes With LESS

A colleague of mine recently took advantage of LESS in a simple but very useful way.

/* Relatively positioned dialog overlay. */
@z-index-dlg-rel-overlay: -2;

/* Position of box shadow to dialog. */
@z-index-dlg-box-shadow: -1;

/* Close button on dialog boxes. */
@z-index-dlg-close: 1;

/* Overlay on the dialog itself making it "disabled." */
@z-index-dlg-disabled: 2;

/* Modal dialog position. */
@z-index-dialog: 100;

/* Main window overlay for dialog. */
@z-index-dlg-overlay: 1000;

/* Position of main (topmost) dialog. */
@z-index-dlg-main: 10000;

/* Main content loading spinner. */
@z-index-cnt-loading: 1;

/* Position of orangle flag triangle on auth pages. */
@z-index-ath-triangle: 1;

/* Absolutely positioned delete (X) button. */
@z-index-btn-delete: 10;

/* Position of downward triangle on content title. */
@z-index-tri-title: 5;

/* Position drawers below dialogs. */
@z-index-drawer: @z-index-dialog - 1;

/* Position of floating navbar. */
@z-index-navbar: 1000;

/* Position of popup about menu. */
@z-index-nav-about: 100;

/* Position of search box on nav menu. */
@z-index-nav-search: 1;

He gathered every z-index on the site into a single LESS file as a variable. Now whenever you add something with a z-index you can think about “yes, I want it on top but not over the floating nav or a modal dialog” rather than just jamming 999999 and hoping for the best. Neat!

Vertically Centering Text in a Fixed-Size Element With No Overflow

Most of the blog posts/tutorials on the web that show you how to center text vertically using either line-height (only works for a single line of text) or using display: table-cell which allows the text to exceed a fixed size (table cells are allowed to get taller than their height if the text is long). Here’s a version that clips the overflow by surrounding the display: table-cell element in a containing div.

Vertically centered text in fixed sized boxes

<style>
    /* Wrapper exists to prevent long text exceeding the bounding box. */
    .wrapper {
        /* height and width here act as a maximum for when the text is very long. */
        height: 100px;
        width: 100px;
        overflow: hidden;
        background: #eee; /* just so we can see what's going on in the demo */
    }
    /* Cell does the work of allowing us to vertical-align: middle. */
    .cell {
        /* height and width here act as a minimum for when the text is not long. */
        height: 100px; /* subtract any padding on wrapper from height and width here */
        width: 100px;
        display: table-cell;
        text-align: center;
        vertical-align: middle;
    }
</style>

<div class="wrapper">
        <div class="cell">
            This is a test.
        </div>
</div>
<hr>
<div class="wrapper">
        <div class="cell">
            This is a test with much longer text.  This is a test with much longer text.  This is a test with much longer text.  This is a test with much longer text.  This is a test with much longer text.
    </div>
</div>

How to Get Reasonably Priced Pay As You Go Data (and Calls and Texts) on an Unlocked iPhone in the US

Here is a system that has worked for me in the USA on an iPhone 4 and an iPhone 5 in July 2012 and February 2013. Both phones were not locked to any particular carrier but they were not jailbroken.

  1. Visit an AT&T store or otherwise get a GoPhone SIM that fits your phone (micro sim or nano sim). Choose whatever rate plan suits your needs. I’m partial to the $25 for 30 days unlimited texts and 250 minutes. Note that the salesperson will almost certainly tell you that you won’t be able to get data on this plan – they are wrong just say ok!
  2. Connect your phone to wifi somewhere (Starbucks if all else fails!) Visit unlockit.co.nz through your phone’s browser. This site is going to make twiddling your internet settings a lot easier – it doesn’t jailbreak or carrier-unlock your phone it just changes internet access settings in a simple way. Choose Create APN, “USA” and “GoPhone”. Press “Install” to install the settings.
  3. Set up a PIN for your account by getting it texted to the phone. You can choose “Forgot Password” from the GoPhone login page to set it up.
  4. Sign in to your GoPhone account and add a data package (200MB for 30 days for $15 for light usage or 1GB for 30 days for $25 for heavier usage). You’ll need a credit/debit card to do this entirely online or you can use a refill card you bought in a store.

And that’s it. You may need to remove the APN settings that you installed when/if you leave the US.

How to Create Movie Barcodes

How to make Movie Barcodes (OS X)

Inspired by the movie barcode tumblr I wanted to make some barcodes of my wife’s films.

Here’s one I made for her latest short:

Twinkle, Twinkle Barcode
Twinkle, Twinkle Barcode

I found some instructions on Mr. Reid’s site and with the help of the comments and some tinkering I came up with this final process on OSX.

You will need mplayer and ImageMagick installed, both of which can be found in MacPorts.

$ mkdir /tmp/barcode
$ cd /tmp/barcode
$ mplayer -framedrop -speed 100 -vf framestep=90 -nosound -vo jpeg [path/to/movie-file]
$ mogrify *.jpg -resize 1x288\! *.jpg
$ montage -geometry +0+0 -tile x1 *.jpg barcode.png

Deployment – If It Hurts, Do It Again

If deployment hurts, do it again.

Frequent painful deployments force you to automate and improve the process.

Your development environment may be wonderful. Your test coverage may be high. But your code has to go live. If that’s an infrequent process of giant merges and close-your-eyes-and-hope code pushes you risk bugs. “Little and often” reduces risk and decreases pain.

The logical conclusion: continuous deployment.

git add -p

I tell people to use git add -p not git add FILE or git add .

The benefits are huge:

  1. You decide exactly what’s in your commit. Any later inspection of your commit is improved — code review, git blame, git log, git bisect — everything. No more “Changed WidgetManager to use Widgets not Sprockets, corrected docs for SprocketManager, deleted four unnecessary files and added a new log image.”
  2. You review your changes before committing. Another chance to spot bugs or realize that your documentation doesn’t make sense or that you left in a TODO or a debug statement.
  3. You won’t accidentally commit work in progress.

You aren’t welcome on my project if you don’t use git add -p! And git add ., frankly, should be an error!