Information Retrieval – The Aust Gate

Category Archives: Information Retrieval

Weeknotes: Redis, PHP, mail and SOAP

I’ve spent some time writing a queueing library using Redis as a backend. I started with the notion that it would need to be a FIFO queue but didn’t want to only use the in-built parts of PHP as a stack using array_pop or array_push. Whilst it might be faster, it doesn’t allow for queue […]

June 6, 2010 – 11:05 am | By iain_emsley | Posted in Information Retrieval, projects | Tagged php, redis, soap | Comments (0)

Weeknotes: Data mining, XML and bibliographies

It seems to have been a week of frantic completion and refactoring. The first half was spent frantically converting HTML pages into PDFs using Verypdf’s HTMLtools server product. All in all the manual is very helpful and the test server could be set up quickly. It might have helped the other end if I’d […]

May 23, 2010 – 10:57 am | By iain_emsley | Posted in Information Retrieval, Open Knowledge, projects | Tagged open_bibliography, open_correspondence, rdf, redis | Comments (0)

Data curation in real time

Robert Scoble’s blog has this intriguing post on real-time curation which has made me think. At the moment I’m working on curating and archiving gigabytes of information at work (and usually on ways of generating more data from the systems). Whilst this is not necessarily real time, I’d like it to be or at least […]

April 1, 2010 – 8:29 pm | By iain_emsley | Posted in Information Retrieval | Comments (0)

Textcamp announced

Had dinner with Rufus Pollock and Ben O’Steen on Monday in Oxford. As part of the discussions, the notion of Textcamp was raised and Ben has created the Textcamp website with an associated blog. It is a slightly bigger concept than I had had but the approach, I think, will allow the creation of a […]

March 28, 2010 – 11:16 am | By iain_emsley | Posted in Information Retrieval, Text Mining | Tagged textcamp | Comments (0)

Exporting and querying Dickens data

As a follow-up to the posting regarding the proposed ontology, I’ve started to try and create a SPARQL endpoint. At some point soon, I want to use the new version of ARC as the version I’ve got here is a little out of date. After that the next thing should be to allow the […]

March 21, 2010 – 12:15 pm | By iain_emsley | Posted in Information Retrieval, projects | Tagged charles dickens, rdf | Comments (0)

Growing and using data

Just seen an article on Techcrunch by Bradford Cross of Flightcaster regarding the growth of data on the Web. He appears to argue that data and its uses will drive the Web soon, writing: the data age is less about the raw size of your data, and more about the cool stuff you can do […]

March 17, 2010 – 7:57 pm | By iain_emsley | Posted in Information Retrieval | Tagged data mining | Comments (0)

Mining data driving the web?

March 17, 2010 – 7:54 pm | By iain_emsley | Posted in Information Retrieval, Text Mining | Tagged data sets | Comments (0)

Bibliographica – open bibliographic sourcing and maintenance

Jonathan Gray of the Open Knowledge Foundation has a thought-provoking post on the need for an Open Bibliographic Service which he calls Bibliographica. As he writes: lists of publications are an absolutely critical part of scholarship. They articulate the contours of a body of knowledge, and define the scope and focus of scholarly enquiry […]

January 24, 2010 – 11:37 am | By iain_emsley | Posted in Information Retrieval, Open Knowledge | Tagged open_bibliography, open_service | Comments (1)

Full text search using PHP and MySQL

I’ve been thinking about full text searching for the letters project and trying to find various solutions that are open source. On the Open Shakespeare and Open Milton sites, we used the Xapian project which is an excellent search engine. However I wanted to try and find a way of getting a search running using […]

December 29, 2009 – 7:38 pm | By iain_emsley | Posted in Information Retrieval | Tagged mysql, php | Comments (0)

Update on the Letters of Dickens

Just started on a new version of the Dickens letters which I’m trying to improve before adding in further volumes of text and other authors. I’ve refactored some of the code to remove some of the cruft and obsolescence. I’ve also been working on the rdf so that I can build up the RDFa links […]

November 22, 2009 – 10:54 am | By iain_emsley | Posted in Information Retrieval, Open Knowledge | Tagged letters | Comments (0)

