Category Archives: Information Retrieval

Interlinked communities gaming the search and social

Carole Cadwalladr’s article, “Google, democracy and the truth about internet search”, published in today’s Observer New Review section, is a thoughtful piece about the way that right-wing politics (under whatever banner they wear today) have used the Web to push their agenda. Her opening paragraph about typing “are jews..” returned the equally appalling […]

Reusing material on social media

A hat tip to Kirsty Rolfe for favouriting this retweet from Sjoerd Levelt: “ICYMI: the lawyers kindly updated their blog after they were informed of the nature of @CathalUK’s @MedievalReacts. pic.twitter.com/8G37iiGJr2” (Sjoerd Levelt (@SLevelt), April 10, 2015). I highly recommend going to the tweet and viewing the conversation that led to this change. The […]

Harmonising the Heterogeneous at Cultures of Knowledge

Harmonising the Heterogeneous at the Cultures of Knowledge seminar series with Eero Hyvönen. Notes are unedited. Two forms of the Web: WWW for humans, GGG (Giant Global Graph) for data. Core data set 1048 data sets and 59 billion triples. Google’s Knowledge Graph and Microsoft’s Satori – graph engines in the search giants. Why […]

A glimpse into the wormhole

The High Scalability blog posted a link to Facebook’s new posts search system and the Facebook Notes written about it by a member of the engineering team. One of the sections mentioned the Wormhole publish/subscribe system that they developed to push data across multiple data centres in near real time. At a very basic level, […]

Exploring Charles Dickens’s networks

As part of the ongoing Open Correspondence rewrite, I’ve started working on some visualisations after a conversation with Rufus Pollock during one of the Humanities calls. One of the immediate ones was a force-directed graph to link all the correspondents to the authors. Well, author at the moment. Although I am aware of SigmaJS, I […]

Thinking about texts and communities at Textcamp

Having gone to Textcamp yesterday, I started playing with Wordle and IBM’s Many Eyes at the suggestion of Dave Flanders of the JISC. As James Harriman-Smith, the organiser and Open Literature co-ordinator for the Open Knowledge Foundation, had suggested that this year is the anniversary of the manuscript of Alexander Pope’s An Essay on Criticism, […]

Weeknotes: Documents and data

The main project this week (apart from the ongoing one of moving and virtualising servers) is to begin work on our technical documents. I’m trying to move them onto the web and make them useful, not only in terms of reading about them but also to make them linkable. I’m trying to get them out […]

Research Databases in the Humanities

I went to the Research Databases in the Humanities workshop, organised by Sudamih, which was an excellent afternoon and time well spent. An Oxford-heavy event, it produced a number of interesting directions. Firstly, James Wilson, project manager of Sudamih at Oxford University Computing Services, outlined the Database as […]

Searching Open Correspondence with Xapian

As part of the continuing work on Open Correspondence, I managed to install Xapian to act as a full text search engine. I’ve been looking to do this for a while and had started working on a remote back end (as blogged here) but decided not to use it as it appears to have […]

Finding the data signal in the noise

Marshall Kirkpatrick, on ReadWriteWeb, poses the question “A web of infinite information: does that sound like a scary problem of ‘just too much’?” in “Mamas, Don’t Let Your Babies Grow Up to Be Data Wranglers”, where he discusses an interview with Evan Williams on GigaOm. (I’m not going to discuss the interview here (but […]