All posts by peter

Harvesting and semantically tagging media releases from political websites using web services

Here are the slides from another talk at VALA2012, where I talked about how we’ve been using OpenCalais at the Victorian Parliamentary Library to add tags and semantic data to one of our databases. You can also see the talk here or download a longer paper that goes with it.

VALA 2012 Reference Desk Software

I was lucky enough to attend VALA2012 and also present a couple of papers. The first was about some Open Source software that we created at the Victorian Parliamentary Library in order to track reference requests. It’s not quite finished, but as soon as it’s wrangled into a coherent jumble of code, I’ll be putting it up on

I’m quite happy with the system and it got some good comments from people at VALA. Here are the slides I presented, which show a bit of the user interface. It was programmed using the Play! Framework.

Life and Literature Code Challenge Entry


The idea for this entry came from the amazing Japanese earthquake map written by Paul Nicholls, in which the earthquakes are shown on a timeline over a Google Map.

I thought it would be interesting to do something with the Biodiversity Heritage Library data for the Life and Literature Code Challenge. I figured that plotting the place of publication over time might show some interesting patterns.



Building a mobile app backend using MongoDB and Slim – a PHP REST framework

I’ve been toying around in my spare time with HTML5, building a geographically enabled web app (possibly turning it into a full-blown mobile app down the track using PhoneGap or Appcelerator). Anyway, I started off with the back end.

I chose MongoDB as the data store (a NoSQL database with really simple out-of-the-box geographic indexing). There are a few threads about Mongo’s geohashing algorithm not coping at very fine scales, but for my purposes it has all the accuracy I need, and the geohashing mechanism is really fast.
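For context, the geographic side really is just a couple of calls. Here’s a sketch using the legacy PECL mongo driver of the era — the database, collection, and field names are my own, and it assumes a MongoDB server running locally with documents storing [lon, lat] pairs in a 'loc' field:

```php
<?php
// Sketch only -- illustrative names, legacy PECL mongo driver API.
$mongo  = new Mongo();           // connects to localhost:27017
$places = $mongo->mydb->places;  // hypothetical db/collection

// Build a '2d' geospatial index on the location field.
// This is the bit that does the geohashing behind the scenes.
$places->ensureIndex(array('loc' => '2d'));

// Fetch the 10 documents nearest a given point (lon, lat order).
$cursor = $places->find(
    array('loc' => array('$near' => array(143.2, -37.8)))
)->limit(10);

foreach ($cursor as $doc) {
    echo $doc['name'], "\n";
}
```

With the index in place, the `$near` query returns results already sorted by distance, which is exactly what a “nearest locations” endpoint needs.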

I needed a REST interface that my mobile app could use to retrieve nearest locations and to add new locations – both fairly simple requirements as the bulk of the computational work is elegantly handled by the backend database. All I needed was something to build the routes and add in my own validation – this is where Slim comes in.

Slim is a micro framework for building REST services, and it does this one thing very well. It lets me build routes for GET, POST, PUT, and DELETE requests and hand them off to appropriate functions in my data model. A minimal example:


    require 'Slim/Slim.php';
    require 'models/LocationStore.php';

    $app = new Slim();
    $ls  = new LocationStore();

    // Pass $ls into the closure and emit the results as JSON
    $app->get('/near/:lat,:lon', function ($lat, $lon) use ($ls) {
        header("Content-Type: application/json");
        echo json_encode($ls->getNear($lat, $lon));
    });

    $app->run();

So a GET request to http://myserver/near/-37.8,143.2 is routed to a function that queries my database for locations near the latitude and longitude passed in. I can also use some neat features built into Slim to validate the passed-in values against a regular expression.
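Those validation features are route conditions — per-parameter regular expressions that the URL must match before the route fires. A sketch of the idea (the patterns here are my own, and `$ls` is the LocationStore from the example above):

```php
<?php
// Sketch only: constrain :lat and :lon to signed decimal numbers,
// so non-numeric values never reach the handler (404 instead).
$app->get('/near/:lat,:lon', function ($lat, $lon) use ($ls) {
    header("Content-Type: application/json");
    echo json_encode($ls->getNear($lat, $lon));
})->conditions(array(
    'lat' => '-?\d+(\.\d+)?',
    'lon' => '-?\d+(\.\d+)?',
));
```

It’s a nice separation: the route worries about what a valid request looks like, and the data model only ever sees clean input.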

There is more to it, of course, and a number of templating tools can be plugged in to turn it into a more fully featured web framework.

Next up is the front end HTML5 code, but that’s a subject for another post.

Slim framework website:




Upgrading Debian 5 to 6

It’s never much fun upgrading major versions of operating systems, and I always get slightly uncomfortable during the process. Knowing you have a good backup is always handy (thanks, Linode). In this case I was going from Debian 5 to 6, and as usual the Debian folks have made it smooth sailing.

This guide on the Linode pages was very useful:

Everything went smoothly when I upgraded my Linode from Lenny to Squeeze, except that MySQL would not start.

This post had the answer I needed.

In short, I had to comment out the ‘skip-bdb’ entry in /etc/mysql/my.cnf and then issue:

apt-get -f install mysql-server

Hosting at Linode

I’ve started hosting all my personal sites at Linode. I have root access to my own virtual machine, on which I’ve installed Debian 5.0, lighttpd, MySQL, and a bunch of Drupal and WordPress sites. The performance boost compared to shared hosting is incredible, and it only costs slightly more. I’d highly recommend this to anyone with some knowledge of Linux administration. The only downside is backups, but I solved that using backup-manager and an Amazon S3 account where the backups get stored (all very easy).
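The S3 side of backup-manager is just a handful of settings in its config file. A sketch of the relevant fragment of /etc/backup-manager.conf — the bucket name and credentials below are placeholders:

```shell
# Fragment of /etc/backup-manager.conf (sketch, not a full config).
# Upload finished archives to Amazon S3.
export BM_UPLOAD_METHOD="s3"
export BM_UPLOAD_S3_DESTINATION="my-backup-bucket"   # placeholder bucket
export BM_UPLOAD_S3_ACCESS_KEY="AKIA..."             # placeholder key
export BM_UPLOAD_S3_SECRET_KEY="..."                 # placeholder secret
```

After that, the nightly backup-manager cron run ships each archive off-box automatically.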

Check out the details at

EDIT: Linode now has a backup solution that is simple and automatic – just what you want!

Attention span of a developer

I play around with a lot of technology and like installing and testing things, but I wonder if I’m missing any great systems because of install fatigue — how long is too long to get a system installed, run through the configuration, and get some test data in there and running? There is nothing like a good ‘quick start tutorial’ or a ‘build a blog in five minutes’ to get you going.

A great example of this was when I recently installed GeoServer. It’s not a simple piece of software, but I was able to follow the documentation and get it up and running in about 20 minutes. If more developers took the time to produce this kind of documentation, with step-by-step instructions, it would really help with the adoption of their technology.

NSWsphere streamed

There’s a lot of talk about the carbon footprint of attending conferences these days, and on Friday I attended my first virtual conference. NSWsphere provided a live stream of the conference. The job they did was excellent (numerous cameras, direct mics, etc.) and I was able to watch easily without getting frustrated.

The second important factor was the live twitter stream. This allowed me to tap into some of the intangibles that you get from going to a conference – the important chit-chat on what everyone thought of the presentations. The advantage of twitter was that it was happening as the presenters were talking, so I didn’t have to wait until the session ended to get people’s views.

The last advantage was that I could tune in to just the presenters I was interested in. I simply printed out the agenda and switched over when they were on.

So despite being in another state, I was still able to get something out of this conference without travelling, without spewing out tonnes of carbon, and while keeping up with most of my real job. Obviously face-to-face meetings with colleagues are better, and I would choose that if I could, but for conferences where you might only have a peripheral interest and can’t justify the cost of attending in person, it might be worth giving this a try.