Mining the Letters of Charles Dickens

As an aside, I’ve started a small project to explore ways of searching and visualising the letters of Charles Dickens, and to try out the Simile library that MIT have produced.

It was originally an extension to the DSpace repository tool, but Rufus Pollock used it in the Open Knowledge Foundation’s Weaving History project, to which I contributed the Milton JSON data file. Originally I’d used it just for biographical timelines, but thinking about it, I wondered how you could use it to mine datasets like the letters of Charles Dickens.

Dickens was a prolific letter writer (the Pilgrim edition extends to 12 thick volumes). I don’t have access to that data, but I did download the first volume (of three) that his daughters edited.

Using Perl, I extracted the date and recipient tags and converted the text file into JSON (as part of a larger process of converting the file into XML and using XSL to transform the data). I then created a table view of the data so that you can easily find the dates of the letters sent to particular people.
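
The extraction step could be sketched roughly as follows. This is Python rather than the Perl mentioned above, and it is not the actual script. The input layout (a "To <recipient>." line followed by a date line), the placeholder names and dates, and the helper names are all assumptions made for illustration. The {"items": [...]} wrapper with "label" and "type" fields follows what I understand Simile Exhibit expects, so check it against the Exhibit documentation before relying on it.

```python
import json
import re
from collections import defaultdict

# Made-up sample text in an assumed layout: a "To <recipient>." line,
# then a date line, then the body. The real edition is messier than this.
SAMPLE = """\
To Mr. Example.
1841-03-12
My dear Sir, ...

To Miss Sample.
1842-06-03
My dear Madam, ...

To Mr. Example.
1842-08-20
My dear Sir, ...
"""

# A header is a whole line of the form "To <recipient>." (hypothetical convention).
HEADER = re.compile(r"^To (?P<recipient>.+?)\.$")


def extract_letters(text):
    """Return one {recipient, date} record per letter header found."""
    lines = text.splitlines()
    letters = []
    for i, line in enumerate(lines[:-1]):
        match = HEADER.match(line.strip())
        if match:
            letters.append({
                "recipient": match.group("recipient"),
                "date": lines[i + 1].strip(),  # assumed: date is on the next line
            })
    return letters


def to_exhibit_json(letters):
    """Serialise the records as an Exhibit-style JSON document (assumed schema)."""
    items = [
        {
            "label": f"Letter to {letter['recipient']}, {letter['date']}",
            "type": "Letter",
            "recipient": letter["recipient"],
            "date": letter["date"],
        }
        for letter in letters
    ]
    return json.dumps({"items": items}, indent=2)


def dates_by_recipient(letters):
    """Group letter dates by recipient, which is what the table view answers."""
    table = defaultdict(list)
    for letter in letters:
        table[letter["recipient"]].append(letter["date"])
    return dict(table)


if __name__ == "__main__":
    found = extract_letters(SAMPLE)
    print(to_exhibit_json(found))
    print(dates_by_recipient(found))
```

The grouping function does in code what the table view does in the browser: given a recipient, it lists the dates of the letters sent to them.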

I’ve also used the same data set to produce a fairly basic timeline of the letters, which is being rewritten from here. It still needs updating to work with the new version of Timeline.
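
For the timeline, the same recipient and date records need reshaping into an event list. The sketch below, again in Python, shows one way that might look. The records are invented placeholders, the function name is hypothetical, and the "dateTimeFormat" / "events" / "start" / "title" / "description" layout is my recollection of the JSON event source that Simile Timeline loads, so treat it as an assumption and verify it against the version of Timeline in use.

```python
import json

# Placeholder records in the shape produced by the extraction step
# (recipient plus ISO 8601 date); the values are invented for illustration.
LETTERS = [
    {"recipient": "Miss Sample", "date": "1842-06-03"},
    {"recipient": "Mr. Example", "date": "1841-03-12"},
    {"recipient": "Mr. Example", "date": "1842-08-20"},
]


def to_timeline_events(letters):
    """Build a Timeline-style event source (assumed schema), oldest letter first."""
    ordered = sorted(letters, key=lambda letter: letter["date"])
    events = [
        {
            "start": letter["date"],
            "title": f"To {letter['recipient']}",
            "description": f"Letter to {letter['recipient']} dated {letter['date']}",
        }
        for letter in ordered
    ]
    return {"dateTimeFormat": "iso8601", "events": events}


if __name__ == "__main__":
    print(json.dumps(to_timeline_events(LETTERS), indent=2))
```

Sorting on the date string only works because the dates are assumed to be in ISO 8601 form; dates as printed in the letters themselves would need parsing first.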