Information Retrieval – The Aust Gate

Category Archives: Information Retrieval

Creating bibliographic resources from web pages

Given the increasingly digital nature of research, which now spans not only websites but also blogs, forums and wikis, the (in my view) beloved Moleskine is becoming increasingly outdated. I’ve just finished writing my first book and had the joy of using Moleskine notebooks to note down URLs and make notes. I like Moleskines a lot but pen and […]

August 15, 2010 – 6:52 pm | By iain_emsley | Posted in Information Retrieval, Open Knowledge, projects | Tagged archiving, warc | Comments (0)

Finding a space for NoSQL

ReadWriteWeb have a post on NoSQL (again?) by Audrey Watters, which is a brief overview of the area. The original post points to the Heroku blog, where Adam Wiggins outlines the uses of NoSQL. I’m not an expert by any means but use Redis on a daily basis with the Rediska PHP library. I remember having […]

July 20, 2010 – 7:11 pm | By iain_emsley | Posted in Information Retrieval | Tagged database, nosql, redis | Comments (0)

Weeknotes: Redis, PHP, mail and SOAP

I’ve spent some time writing a queueing library using Redis as a backend. I started with the notion that it would need to be a FIFO queue but didn’t want to only use the in-built parts of PHP as a stack using array_pop or array_push. Whilst it might be faster, it doesn’t allow for queue […]

June 6, 2010 – 11:05 am | By iain_emsley | Posted in Information Retrieval, projects | Tagged php, redis, soap | Comments (0)

Weeknotes: Data mining, XML and bibliographies

It seems to have been a week of frantic completion and refactoring. The first half was spent converting HTML pages into PDFs using Verypdf’s HTMLtools server product. All in all the manual is very helpful and the test server could be set up quickly. It might have helped the other end if I’d […]

May 23, 2010 – 10:57 am | By iain_emsley | Posted in Information Retrieval, Open Knowledge, projects | Tagged open_bibliography, open_correspondence, rdf, redis | Comments (0)

Data curation in real time

Robert Scoble’s blog has an intriguing post on real-time curation which has made me think. At the moment I’m working on curating and archiving gigabytes of information at work (and usually on ways of generating more data from the systems). Whilst this is not necessarily real time, I’d like it to be or at least […]

April 1, 2010 – 8:29 pm | By iain_emsley | Posted in Information Retrieval | Comments (0)

Textcamp announced

Had dinner with Rufus Pollock and Ben O’Steen on Monday in Oxford. As part of the discussions, the notion of Textcamp was raised and Ben has created the Textcamp website with an associated blog. It is a slightly bigger concept than I had had but the approach, I think, will allow the creation of a […]

March 28, 2010 – 11:16 am | By iain_emsley | Posted in Information Retrieval, Text Mining | Tagged textcamp | Comments (0)

Exporting and querying Dickens data

As a follow-up to the posting regarding the proposed ontology, I’ve started to try and create a SPARQL endpoint. At some point soon, I want to use the new version of ARC as the version I’ve got here is a little out of date. After that the next thing should be to allow the […]

March 21, 2010 – 12:15 pm | By iain_emsley | Posted in Information Retrieval, projects | Tagged charles dickens, rdf | Comments (0)

Growing and using data

Just seen an article on Techcrunch by Bradford Cross of Flightcaster regarding the growth of data on the Web. He appears to argue that data and its uses will drive the Web soon, writing: the data age is less about the raw size of your data, and more about the cool stuff you can do […]

March 17, 2010 – 7:57 pm | By iain_emsley | Posted in Information Retrieval | Tagged data mining | Comments (0)

Mining data driving the web?

March 17, 2010 – 7:54 pm | By iain_emsley | Posted in Information Retrieval, Text Mining | Tagged data sets | Comments (0)

Bibliographica – open bibliographic sourcing and maintenance

Jonathan Gray of the Open Knowledge Foundation has a thought-provoking post on the need for an Open Bibliographic Service, which he calls Bibliographica. As he writes: lists of publications are an absolutely critical part of scholarship. They articulate the contours of a body of knowledge, and define the scope and focus of scholarly enquiry […]

January 24, 2010 – 11:37 am | By iain_emsley | Posted in Information Retrieval, Open Knowledge | Tagged open_bibliography, open_service | Comments (1)
