Text Mining – The Aust Gate

Category Archives: Text Mining

Making Milton sparql

I’ve been going over some ideas that have been bubbling in my mind for a while about using RDF to load in further details about a text in question. I’ve gone back to an old Milton file, the Areopagitica, that I created for another project but never really used. Essentially it’s part of the Burke […]

October 26, 2010 – 8:56 pm | By iain_emsley | Posted in projects, Text Mining | Tagged open_literature, open_milton | Comments (0)

Installing Xapian into Open Correspondence and next steps

As an aid to getting over the first (and hopefully last) seasonal cold, I’ve been implementing Xapian as a search engine, using the Python bindings. I did look at Solr as an alternative but the set-up costs outweighed the benefits, given that Xapian is already installed on the server as part of Python. Unlike OpenMilton, […]

October 17, 2010 – 4:16 pm | By iain_emsley | Posted in projects, Text Mining | Tagged open_correspondence, search, xapian | Comments (1)

A change to the Letters project

During the previously blogged dinner with Ben and Rufus, we talked about the nascent work on the letters project. Both have “encouraged” me (it didn’t take too much persuasion, it must be said) to move the project to the Open Knowledge Foundation and to port it to Python with a Redis backend rather than the […]

March 28, 2010 – 11:19 am | By iain_emsley | Posted in Open Knowledge, projects, Text Mining | Tagged letters | Comments (0)

Textcamp announced

Had dinner with Rufus Pollock and Ben O’Steen on Monday in Oxford. As part of the discussions, the notion of Textcamp was raised and Ben has created the Textcamp website with an associated blog. It is a slightly bigger concept than I had had but the approach, I think, will allow the creation of a […]

March 28, 2010 – 11:16 am | By iain_emsley | Posted in Information Retrieval, Text Mining | Tagged textcamp | Comments (0)

Mining data driving the web?

Just seen an article on Techcrunch by Bradford Cross of Flightcaster regarding the growth of data on the Web. He appears to argue that data and its uses will drive the Web soon, writing: the data age is less about the raw size of your data, and more about the cool stuff you can do […]

March 17, 2010 – 7:54 pm | By iain_emsley | Posted in Information Retrieval, Text Mining | Tagged data sets | Comments (0)

Letters of Charles Dickens website

I’ve finally posted the first draft of the Dickens website here: https://austgate.co.uk/dickens/index.php?author=Dickens. The idea is that it will allow users to derive networks across a variety of Victorian authors as and when I can develop the datasets. I’ve also been developing a small text ontology to add to the Friend of a Friend (FOAF) […]

September 18, 2009 – 8:39 pm | By iain_emsley | Posted in Information Retrieval, Text Mining | Tagged charles dickens, letters | Comments (0)

Mining the Letters of Charles Dickens

As an aside, I’ve started a small project to begin visualising ways of searching the letters of Charles Dickens and exploring the Simile library which MIT have produced. It’s originally an extension to the D-Space repository tool but Rufus Pollock used it in the Open Knowledge Foundation’s Weaving History project – to which I contributed the […]

July 14, 2009 – 7:41 am | By iain_emsley | Posted in Information Retrieval, Text Mining | Tagged charles dickens, simile | Comments (0)

Rethinking the idea of the “text”

Is a text really stable? Is it an entity? In a lecture during my final year at the University of Leicester, one of the English lecturers posed a question: What is a text? After soliciting various answers from the masses, he argued that a text is anything – email, note, manuscript and so on. So […]

May 22, 2009 – 8:07 am | By iain_emsley | Posted in Open Knowledge, Text Mining | Tagged text | Comments (0)

Building data stores

Mats Dahlstrom’s talk at the Dilemmas of Digitization conference mentioned the Deep Sharing: A Case for the Federated Digital Library paper by David Seaman. It would be great if there was a system for rapidly building small data stores from scratch to include texts and then have these with editing software components, text encoding output […]

July 6, 2008 – 12:32 pm | By iain_emsley | Posted in Information Retrieval, Text Mining | Comments (0)
