Google are working with various publishing partners to digitize newspapers from, it appears, Canada and North America. As Punit Soni, the product manager for the programme, writes: “This effort is just the beginning. As we work with more and more publishers, we’ll move closer towards our goal of making those billions of pages of newsprint […]
Category Archives: Information Retrieval
Sourcing the attribution
Over on BoingBoing, Cory Doctorow has an excellent link to Danny O’Brien’s post on attribution for re-using works on the Internet. Attribution is, to my mind, one of the keys to creating a successful and thriving remix environment. Why? At one level it is simple courtesy to mention where one gets the item that is […]
Storing data from blogs and wikis
Institutional repositories already exist to store abstracts and documents. I was wondering if any of these have a way of storing blog posts or wiki pages and identifying their states; i.e. if a user was looking at a wiki page, they could see and archive edits to find its history. Whilst wikis do this as […]
UK government asks “Showusabetterway.co.uk”
The Guardian reports that its Free Our Data campaign took another step closer to its goal today. Tom Watson, currently the Cabinet Office minister, is one of the forces behind a competition with the first prize of £20,000 for the best use of non-personal public data available through Showusabetterway.
Getting vertigo retrieving information
Last week I went along to the ISKO UK seminar/event on Information Retrieval (IR) held at University College London. Brian Vickery gave a talk about the first fifty years or so of IR. Like any good event, I came away with loads to ponder. I’m still pondering some of my notes (I wish my handwriting […]
Building data stores
Mats Dahlström’s talk at the Dilemmas of Digitization conference mentioned the Deep Sharing: A Case for the Federated Digital Library paper by David Seaman. It would be great if there were a system for rapidly building small data stores from scratch to include texts and then have these with editing software components, text encoding output […]
Spelunking text data
One of the ARTFL developers presented PhiloLogic and its PhiloMine extension. Both are free text-searching databases and tools. Both sets of code are designed for large sets of data, which raises the question of whether it might be useful to develop a set of tools for smaller data holdings or individuals.
Communities of repositories
I was recently at the Dilemmas of Digitization conference held at the Maison Française in Oxford and organised by the Cost 32 group, a project looking at creating open scholarly communities online across Europe. One of the points that interested me is the idea that repositories need to develop services of their own to the […]