Dead Simple Google Maps API Geolocation

Google turned off the v2 Google Maps API on September 9th, which means my 2008 PHP Wrapper for Google Maps API Geocoding has ceased to function.

I’ve put a replacement dead simple PHP wrapper for v3 of the Google Maps Geolocation API on GitHub. It has the same API as before.

Requests to v2 of the API now get back "We're sorry... ... but your computer or network may be sending automated queries. To protect our users, we can't process your request right now. See Google Help for more information.", which isn’t a super useful message.
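
For anyone curious what a v3 geocoding call involves, here is a minimal sketch of the round trip. It is not the code of the wrapper mentioned above: the function names geocodeUrl and parseGeocodeResponse are made up for illustration, and it assumes you have an API key for the JSON geocoding endpoint.

```php
<?php
// Hypothetical helpers for illustration only; not the wrapper's actual API.

// Build the request URL for the v3 JSON geocoding endpoint.
function geocodeUrl(string $address, string $apiKey): string
{
    return 'https://maps.googleapis.com/maps/api/geocode/json?'
        . http_build_query(['address' => $address, 'key' => $apiKey]);
}

// Pull the first lat/lng pair out of a response body.
// Returns null if the status is not "OK" or no location is present.
function parseGeocodeResponse(string $json): ?array
{
    $data = json_decode($json, true);
    if (!is_array($data) || ($data['status'] ?? null) !== 'OK') {
        return null;
    }
    $location = $data['results'][0]['geometry']['location'] ?? null;
    if (!isset($location['lat'], $location['lng'])) {
        return null;
    }
    return ['lat' => (float) $location['lat'], 'lng' => (float) $location['lng']];
}
```

In real use you would fetch geocodeUrl(...) with file_get_contents or curl and hand the body to parseGeocodeResponse. Keeping the parsing separate from the HTTP call means it can be tested against a canned response.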

Learning Clojure

I completed the first 50 problems at

The site teaches Clojure the usual way: by presenting you with a long series of tasks.

As well as setting the problems, it tests your answers live in the browser.

You can follow other users. Once you complete a problem you can see their solutions.

Sometimes that leads you from a solution that looks like this:

(fn [coll]
  ((fn dist [prev coll]
     (when-let [[x & xs] (seq coll)]
       (let [more (dist (conj prev x) xs)]
         (if (contains? prev x)
           more
           (cons x more)))))
   #{} coll))

to one that looks like this:

reduce #(if (some #{%2} %) % (conj % %2)) []

Good stuff.

Practical URL Validation

There’s a lot of code out there that deals with URL validation. Basically all of it is concerned with “does this URL meet the RFC spec?” That’s actually not that interesting a question if you are validating user input. You don’t want ‘gopher://whatever/’ or ‘’ as valid URLs in your system if you have asked the user for the address of a web page.

What’s more, because of punycode-powered internationalized URLs, most URL-validating code will tell you that real URLs that people can use today are invalid (PHP’s parse_url is not even utf8 safe).
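
One way around the parse_url problem is to hide the non-ASCII bytes from it and put them back afterwards. The following is a rough sketch of that idea, not Joomla's implementation: parseUrlUtf8 is a made-up name, and it assumes the input is a UTF-8 string.

```php
<?php
// Hypothetical utf8-safe wrapper around parse_url (illustrative sketch).
// Returns the same array shape as parse_url, or false on failure.
function parseUrlUtf8(string $url)
{
    // Percent-encode '%' itself plus every non-ASCII byte. Encoding '%'
    // first means the decode step below restores the input exactly,
    // including any escapes such as %20 that were already in the URL.
    $encoded = preg_replace_callback(
        '/[%\x80-\xFF]/',
        function (array $m): string {
            return rawurlencode($m[0]);
        },
        $url
    );

    $parts = parse_url($encoded);
    if ($parts === false) {
        return false;
    }

    // Undo the encoding on each string component (port stays an int).
    foreach ($parts as $key => $value) {
        if (is_string($value)) {
            $parts[$key] = rawurldecode($value);
        }
    }
    return $parts;
}
```

Because the encoding is reversed byte for byte, gluing the returned parts back together gives the original string, which matters if you later compare the reassembled URL against the input.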

Here’s some PHP code that validates URLs in a more practical way. It uses the list of TLDs in static::$validTlds from the IANA list of valid TLDs and assumes the presence of a utf8-safe $this->parseUrl such as Joomla’s version.

    // (c) 2013 Thomas David Baker, MIT License

    /**
     * Return true if url is valid, false otherwise.
     * Note that this is not the RFC definition of a valid URL.  For example we
     * differ from the RFC in only accepting http and https URLs, not accepting
     * single word hosts, and accepting any characters in hostnames (as modern
     * browsers will punycode translate them to ASCII automatically).
     * @param string $url Url to validate.  Must include 'scheme://' to have any
     *                    chance of validating.
     * @return boolean
     */
    public function validUrl($url) {
        $parts = $this->parseUrl($url);

        // We must be able to recognize this as some form of URL.
        if (!$parts) {
            return false;
        }

        // SCHEME.
        // Must be qualified with a scheme.
        if (!isset($parts['scheme']) || !$parts['scheme']) {
            return false;
        }
        // Only http and https are acceptable.  No ftp or similar.
        if (!in_array($parts['scheme'], ['http', 'https'])) {
            return false;
        }

        // If a URL has unrecognized bits then it is not valid - for example the
        // 'z' in ''.
        // This check invalidates URLs that use a user - we don't allow those.
        $partsCheck = $parts;
        $partsCheck['scheme'] .= '://';
        if (isset($partsCheck['port'])) {
            $partsCheck['port'] = ':' . $partsCheck['port'];
        }
        if (isset($partsCheck['query'])) {
            $partsCheck['query'] = '?' . $partsCheck['query'];
        }
        if (isset($partsCheck['fragment'])) {
            $partsCheck['fragment'] = '#' . $partsCheck['fragment'];
        }
        if (implode('', $partsCheck) !== $url) {
            return false;
        }

        // HOST.
        if (!isset($parts['host']) || !$parts['host']) {
            return false;
        }
        // Single word hosts are not acceptable.
        if (strpos($parts['host'], '.') === false) {
            return false;
        }
        if (strpos($parts['host'], ' ') !== false) {
            return false;
        }
        if (strpos($parts['host'], '--') !== false) {
            return false;
        }
        if (strpos($parts['host'], '-') === 0) {
            return false;
        }
        // Cope with internationalized domain names.
        $host = idn_to_ascii($parts['host']);

        $hostSegments = explode('.', $host);
        // The IANA lists TLDs in uppercase, so we do too.
        $tld = mb_strtoupper(array_pop($hostSegments));
        if (!$tld) {
            return false;
        }
        if (!in_array(mb_strtoupper($tld), static::$validTlds)) {
            return false;
        }
        $domain = array_pop($hostSegments);
        if (!$domain) {
            return false;
        }

        // PATH.
        if (isset($parts['path']) && substr($parts['path'], 0, 1) !== '/') {
            return false;
        }

        // If you made it this far you're golden.
        return true;
    }
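
The validator relies on static::$validTlds being filled from the IANA list. As a sketch of how that array might be built, assuming the list is plain text with a '#' comment line at the top and one TLD per line (the helper name parseTldList is made up for illustration):

```php
<?php
// Hypothetical helper: turn the text of a TLD list into an array of
// uppercase TLDs. Assumes one TLD per line and '#' for comment lines.
function parseTldList(string $text): array
{
    $tlds = [];
    foreach (preg_split('/\R/', $text) as $line) {
        $line = trim($line);
        // Skip blank lines and the comment header.
        if ($line === '' || $line[0] === '#') {
            continue;
        }
        // Normalize to uppercase to match the comparison in validUrl.
        $tlds[] = strtoupper($line);
    }
    return $tlds;
}
```

The list changes as new TLDs are delegated, so it is worth refreshing it from the source periodically rather than pasting it in once and forgetting about it.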

Looking at the list of interesting URLs from and elsewhere, validUrl allows all of the following:

And disallows all of these:

# Invalid URLs
http://##/ should be encoded
:// should fail quux
# The following URLs are valid by the letter of the law but we don't want to allow them.