Category Archives: Text Mining

A simple experiment in Sound and Vision for Hamlet

The aim of this hack is to explore turning the structures of the First Folio texts marked up using Text Encoding Initiative XML (TEI) into notes using the ChucK, PHP and Processing languages. I wanted to explore the processes for transforming the texts for the user and explore different ways of presenting the textual […]

Harmonising the Heterogeneous at Cultures of Knowledge

Harmonising the Heterogeneous at the Cultures of Knowledge seminar series with Eero Hyvönen. Notes are unedited. Two forms of the Web: WWW for humans, GGG (Giant Global Graph) for data. Core data set 1048 data sets and 59 billion triples. Google’s Knowledge Graph and Microsoft’s Satori – graph engines in the search giants. Why […]

Future of Editing – some reflections on Nicole Pohl on Sarah Scott

Today’s seminar in The Future of Editing series, “‘An Editor’s duty is indeed that of most danger’ (Piozzi): editing Sarah Robinson Scott”, by Nicole Pohl, which the Bodleian Digital Library Systems and Services is holding at the Oxford e-Research Centre, was a thought-provoking one in terms of the questions raised a series of points […]

Transcribing Bentham seminar notes

Melissa Terras talked about Transcribing Bentham, a collaborative project to transcribe the volumes of Bentham at University College London, at the first seminar in the Cultures of Knowledge seminars. Bentham believed in education for all who could afford it in London. UCL has 60,000 volumes and BL has 30,000. 40,000 volumes were untranscribed […]

A quick skim into mining Twitter data

This is a variant on the text prepared for a short talk at the Open Science evening at the Oxford e-Research Centre on Wednesday 27th November. Peter Murray-Rust also spoke at the event on the AMI software and the Chemical Tagger. This is a brief talk about some work that I have been doing in […]

Weeknotes – Scripting and scraping

It has been a while since I last posted a week note, so I thought I would try and get back in the habit. I’ve been involved in glueing together profiling tools to run so that I can have a vaguely generic framework to profile software at the IO level and the CPU level. Shell […]

Attending the Open Humanities Hack

I’ve just come back from a couple of excellent days of Humanities Hacking, organised by the King’s College, London Digital Humanities department and the Open Knowledge Foundation. To be fair, it went slightly differently than I thought it would. After an interesting start trying to find the room we were in, a few of us […]

Looking at mentions and users in a Twitter message

I was preparing for the recent OK Festival and discovered that the Weird Council was taking place, a conference on the awesome China Miéville. As you may guess, I am a bit of a fan. Unfortunately I was not aware that it had taken place so I watched it on Twitter. Whilst on my travels, […]

Thinking about texts and communities at Textcamp

Having gone to Textcamp yesterday, I started playing with Wordle and IBM’s Many Eyes at the suggestion of Dave Flanders of the JISC. As James Harriman-Smith, the organiser and Open Literature co-ordinator for the Open Knowledge Foundation, had suggested that this year is the anniversary of the manuscript of Alexander Pope’s An Essay on Criticism, […]

Weeknotes: Open Correspondence toolkit and converting XML into JSON

I’ve been quiet for a bit, though generally because I’ve been quite busy on projects and exploring ideas. After Book Hackday, I’ve written a post about beginning to develop the Open Correspondence toolkit for the Open Knowledge Foundation’s Notebook blog. I was also contacted regarding converting the TEI XML pages into JSON, which I am […]