Category Archives: Information Retrieval

Weeknotes: Documents and data

The main project this week (apart from the ongoing one of moving and virtualising servers) is to begin work on our technical documents. I’m trying to move them onto the web and make them useful, not only in terms of reading them but also by making them linkable. I’m trying to get them out […]

Research Databases in the Humanities

I went to the Research Databases in the Humanities workshop, organised by Sudamih, which was an excellent afternoon and time well spent. It was an Oxford-heavy event, and a number of interesting directions came out of the afternoon. Firstly, James Wilson, project manager of Sudamih at Oxford University Computing Services, outlined the Database as […]

Searching Open Correspondence with Xapian

As part of the continuing work on Open Correspondence, I managed to install Xapian to act as a full text search engine. I’ve been looking to do this for a while and had started working on a remote back end (as blogged here) but decided not to use it as it appears to have […]

Finding the data signal in the noise

Marshall Kirkpatrick, on ReadWriteWeb, poses the question “A web of infinite information: does that sound like a scary problem of ‘just too much’?” in “Mamas, Don’t Let Your Babies Grow Up to Be Data Wranglers”, where he discusses an interview with Evan Williams on GigaOm. (I’m not going to discuss the interview here (but […]

Hacking Arts Council data

I lost my hackday cherry yesterday and went to the Open Data hackathon to look at the South East arts council data found at the site ( Our hosts, White October, were fantastic and welcoming (and put the kettle on as soon as I came in!) and Incuna provided the much needed pizzas for […]

Weeknotes: Open Correspondence, Xapian and Linked Data

After last week’s server move, we discovered one or two things that needed to be changed before they could go live. The main thing was the Xapian search which I had been working on. The initial version kept the Xapian server on the local machine and used that to index and search the letters but […]

Tweeting changes with Node.js

As a break from Open Correspondence, I’ve been looking at Node.js, the server-side JavaScript platform. I’ve been thinking about the document stuff that I’ve been working on with Milton. One of the things that I had mooted as an idea was reading tweets from Twitter and pushing them back to the document. I’ve been playing with […]

Weeknotes: Ubuntu, messaging and Open Correspondence

It has been a while since the last weeknotes. I’ve finally made the move to Linux, or at least dual booting, by installing Ubuntu, so I’m currently learning a little about the OS and getting a development environment set up for it. I’ve nearly finished the ongoing accounts project at work. The framework is up and […]

Creating bibliographic resources from web pages

Given the increasingly digital nature of research, including not only websites but blogs, forums and wikis, the (in my view) beloved moleskin is becoming increasingly outdated. I’ve just finished writing my first book and had the joy of using moleskin notebooks to note down URLs and make notes. I like moleskins a lot but pen and […]

Finding a space for NoSQL

ReadWriteWeb have a post on NoSQL (again?) by Audrey Watters which is a brief overview of the area. The original post points to the Heroku blog, where Adam Wiggins outlines the uses of NoSQL. I’m not an expert by any means but use Redis on a daily basis with the Rediska PHP library. I remember having […]