April 18th, 2012

LOD-LAM meeting at Melbourne Museum 17 April 2012

Yesterday I spent a really interesting morning at Museum Victoria attending and presenting at a workshop on Linked Open Data in Libraries, Archives and Museums (LOD-LAM) organised by Culture Victoria. As usual with these things, there were varying levels of technical and background knowledge on the topic. However, I think the level of the presentations was spot on and, judging by the number of questions and discussions, there was a lot of interest in this topic.

Mia Ridge (@mia_out) gave an excellent introduction to Linked Data and how it is used in libraries, galleries and museums. She’s put together a wiki with links to all sorts of useful stuff.

Below are the slides from my talk on activities at the Victorian Parliamentary Library (astute viewers will notice they are mostly lifted straight out of my VALA talk).

 

March 25th, 2012

Harvesting and semantically tagging media releases from political websites using web services

Here are the slides from another talk at VALA2012, where I talked about how we’ve been using OpenCalais at the Victorian Parliamentary Library to add tags and semantic data to one of our databases. You can also see the talk here or download a longer paper that goes with the talk.

February 14th, 2012

VALA 2012 Reference Desk Software

I was lucky enough to attend VALA2012 and also present a couple of papers. The first was about some open source software that we created at the Victorian Parliamentary Library in order to track reference requests. It’s not quite finished, but as soon as it’s wrangled into a coherent jumble of code, I’ll be putting it up on http://github.com/pov.

I’m quite happy with the system and it got some good comments from people at VALA. Here are the slides I presented that show a bit of the user interface. It was programmed using the Play! Framework.

September 21st, 2011

Life and Literature Code Challenge Entry

Background

The idea for this entry came from the amazing Japanese earthquake map written by Paul Nicholls, in which the Japanese earthquakes are shown in a timeline on a Google map.

I thought it would be interesting to do something with the Biodiversity Heritage Library data for the Life and Literature Code Challenge. I figured that plotting the place of publication over time might show some interesting patterns.

 

Approach

First I had to get the place of publication out of the BHL data. The Google Geocode API did a fair job of this, although it did get a bit messed up with some of the older titles. I pre-processed the publication field with a few perl scripts to make life a little easier for the geocoder. I also ran up against the geocoder’s API limits, so I don’t have results for all the titles in the BHL.
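To give a flavour of the clean-up involved, here is a minimal sketch, written in Python rather than the original Perl. The rules and the function name clean_place are assumptions for illustration only, not the scripts actually used.

```python
import re

def clean_place(raw):
    """Normalise a catalogue place-of-publication string before geocoding.

    Hypothetical helper: the rules below are illustrative assumptions.
    """
    text = raw.strip()
    # Drop cataloguer's square brackets but keep the text inside them.
    text = text.replace("[", "").replace("]", "")
    # Remove filler such as "etc." and "s.l." (sine loco, place unknown).
    text = re.sub(r"\b(etc|s\.\s*l)\.", "", text, flags=re.IGNORECASE)
    # Keep only the first place when several are listed ("London ; New York").
    text = re.split(r"[;&]", text)[0]
    # Trim trailing punctuation left over from the catalogue record.
    text = text.strip(" ,:;.")
    # Collapse runs of whitespace.
    return re.sub(r"\s+", " ", text)
```

An empty result (as for an "[S.l.]" record) would simply be skipped rather than sent to the geocoder, which also saves a request against the API limit.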

Once I had some geocodes, I loaded the data into MySQL, where I could run some queries to check how well the geocoding went. I then exported this to a large JSON object that gets loaded by the web page. Using a jQuery UI button and slider, I was able to animate the slider programmatically and draw circles on the Google map.
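The export step can be sketched roughly as follows, assuming the query returns rows of (year, latitude, longitude). The row shape, the rounding and the name build_timeline are assumptions for illustration; the real export may have looked quite different.

```python
import json
from collections import defaultdict

def build_timeline(rows):
    """Group (year, lat, lon) rows by year, one list of points per slider step."""
    by_year = defaultdict(list)
    for year, lat, lon in rows:
        # Round coordinates to keep the JSON payload small.
        by_year[year].append([round(lat, 3), round(lon, 3)])
    # JSON object keys must be strings; sort so the years come out in order.
    return json.dumps({str(y): by_year[y] for y in sorted(by_year)})

# Made-up sample rows standing in for the database export.
rows = [
    (1850, 51.5074, -0.1278),
    (1850, 48.8566, 2.3522),
    (1901, -37.8136, 144.9631),
]
timeline = build_timeline(rows)
```

Keying the object by year means the page only has to look up the current slider value to find the circles to draw.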

You can see it in action here:

 

bhl.neish.net

There are no doubt plenty of geocoding errors still in there (I edited out the main ones that I could see), but the overall patterns are still evident. It slows down a bit on Internet Explorer, but the results are OK when I test on Firefox, Chrome and Safari.

The results are quite interesting and show how publications start off in Europe and then progress to the rest of the world during the 19th century. The late 19th and early 20th centuries are the boom time and the results taper off during the 20th century (I guess as we enter the time period covered by copyright). Not many dots popping up over Australia – hopefully that will change as the BHL – Australia digitising gets underway.

 

Next…

This example makes use of only the bare minimum of data from the title table in the BHL (and I didn’t even use every record due to performance issues when too many dots were being created at once). It would be possible to get a lot more dates by linking up the item and title data and running the analysis off the items rather than the titles.

What would be really interesting is if we could link in the page and taxon name data and dynamically generate the data so that you could look at the publications for a particular taxon over time – it might be interesting to see where items about ants are being published or eucalypts for example.

This is a work in progress. Comments welcome.

 

 

August 6th, 2011

Building a mobile app backend using MongoDB and Slim – a PHP REST framework

I’ve been toying around in my spare time with HTML5 and building a geographically enabled web app (possibly making it into a full-blown mobile app down the track using PhoneGap or Appcelerator). Anyway, I started off with the back end.

I chose MongoDB as the data store (a NoSQL database with really simple out-of-the-box geographic indexing). There are a few threads about MongoDB’s geohashing algorithms not coping at very fine scales, but for my purposes it has all the accuracy I need and the geohashing mechanism is really fast.

I needed a REST interface that my mobile app could use to retrieve nearest locations and to add new locations – both fairly simple requirements as the bulk of the computational work is elegantly handled by the backend database. All I needed was something to build the routes and add in my own validation – this is where Slim comes in.
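As a rough illustration of how little the "nearest locations" request needs, here is a sketch in Python that only builds the query document. The field name loc and the helper near_query are assumptions, and the coordinate order follows MongoDB's convention of longitude first for its 2d indexes.

```python
def near_query(lat, lon, limit=10):
    """Build a $near filter on an assumed 'loc' field holding [lon, lat] pairs."""
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        raise ValueError("coordinates out of range")
    # MongoDB's 2d indexes expect [longitude, latitude] order.
    return {"filter": {"loc": {"$near": [lon, lat]}}, "limit": limit}

query = near_query(-37.8, 143.2, limit=5)
# With a driver such as pymongo this could be run as (collection is assumed):
#   collection.find(query["filter"]).limit(query["limit"])
```

The database does the distance ordering, so the application code is left with validation and routing.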

Slim is a micro framework for building REST services and it does this one thing very well. It allows me to build routes for GET, POST, PUT and DELETE requests and hand them off to appropriate functions in my data model. A minimal example:

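<?php

require 'Slim/Slim.php';
require 'models/LocationStore.php';

$app = new Slim();
$ls  = new LocationStore();

// GET /near/<lat>,<lon> returns the nearest locations as JSON.
// The closure has to import $ls with "use", otherwise it is undefined inside.
$app->get('/near/:lat,:lon', function ($lat, $lon) use ($ls) {
    header("Content-Type: application/json");
    echo json_encode($ls->getNear($lat, $lon));
});

$app->run();

?>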

So a GET request to http://myserver/near/-37.8,143.2 would be routed to a function that queries my database for locations near the latitude and longitude passed in. I can also use some neat features built in to Slim to validate the passed in values against a regular expression.
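The validation idea is easy to show outside of Slim. The sketch below is in Python and is not Slim's API: the pattern, the range check and the name parse_near are assumptions, chosen to show the kind of regular expression a latitude/longitude pair might be checked against.

```python
import re

# Optional minus sign, digits, optional decimal part, for each value.
NEAR_ROUTE = re.compile(r"^/near/(-?\d{1,2}(?:\.\d+)?),(-?\d{1,3}(?:\.\d+)?)$")

def parse_near(path):
    """Return (lat, lon) for a valid /near/<lat>,<lon> path, else None."""
    match = NEAR_ROUTE.match(path)
    if match is None:
        return None
    lat, lon = float(match.group(1)), float(match.group(2))
    # The regex only checks the shape; the ranges still need a numeric check.
    if abs(lat) > 90 or abs(lon) > 180:
        return None
    return lat, lon
```

Rejecting malformed values at the routing layer means the data model never sees anything that is not a plausible coordinate.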

There is more to it of course and a number of templating tools can be plugged in to make it into a more fully featured web framework.

Next up is the front end HTML5 code, but that’s a subject for another post.

Slim framework website: http://www.slimframework.com/

 

 

 

July 1st, 2011

jQuery and DB/Textworks

Below are my slides from a talk I gave to the Melbourne Inmagic User Group on jQuery and DB/Textworks. Unfortunately all the web stuff is behind a firewall, so I can’t link to it. However, I’ll put the plugins I developed up on Google Code.

 

February 9th, 2011

Upgrading Debian 5 to 6

It’s never much fun upgrading major versions of operating systems and I always get slightly uncomfortable during the process. Knowing you have a good backup is always handy (thanks, Linode). In this case I was going from Debian 5 to 6 and, as usual, the Debian folks have made it smooth sailing.

This guide on the linode pages was very useful:

http://library.linode.com/troubleshooting/upgrade-to-debian-6-squeeze

All went smoothly when I upgraded my Linode from Lenny to Squeeze, except that MySQL would not start.

This post http://www.robtucker.co.uk/2009/05/16/upgrading-mysql-50-to-51-on-debian-50 had the answer I needed.

In short, I had to comment out the ‘skip-bdb’ entry in /etc/mysql/my.cnf and then issue:

apt-get -f install mysql-server

February 5th, 2011

Creek near my house flooding in Melbourne

Sorry about the low light – it was pretty dark at the time.

December 4th, 2009

Hosting at Linode

I’ve started hosting all my personal sites at Linode. I have root access to my own virtual machine and I have installed Debian 5.0, lighttpd, MySQL, and a bunch of Drupal and WordPress sites. I have found an incredible performance boost compared to shared hosting and it only costs slightly more. I’d highly recommend this for anyone with some knowledge of Linux administration. The only downside is backups, but I solved that using backup-manager and creating an Amazon S3 account where the backups get stored (all very easy).

Check out the details at linode.com.

EDIT: Linode now has a backup solution that is simple and automatic – just what you want!

September 18th, 2009

Attention span of a developer

I play around with a lot of technology and like installing and testing things, but I wonder if I am missing any great systems because of install fatigue – how long is too long to get a system installed, run through the configuration, and get some test data in there and running? There is nothing like a good ‘quick start tutorial’ or a ‘build a blog in five minutes’ to get you going.

A great example of this was when I recently installed GeoServer. Now this is not a simple piece of software, but I was able to follow the documentation to install it and get it up and running in about 20 minutes. If more developers took the time to produce this kind of documentation, with step-by-step instructions, it would really help with adoption of their technology.