August 12th, 2008
I’ve nearly finished the Milton “patch” to Shakespeare and will be loading it to the svn tonight. The all-new Shakespeare runs on Pylons and looks rather nice.
One of the next things that I want to look at is the idea of making the interface machine readable and using Linked Data (good tutorial on Chris Bizer’s site) to try and establish the beginnings of an Open Knowledge Web of Data.
As per a previous post, one of the issues that comes up is licensing - how does one force the data out there rather than wait for it to be found?
Tags: open_milton
Posted in Open Knowledge | No Comments »
August 12th, 2008
Over on BoingBoing, Cory Doctorow has an excellent link to Danny O’Brien’s post on attribution for re-using works on the Internet.
Attribution is, to my mind, one of the keys to creating a successful and thriving remix environment. Why? At one level it is simple courtesy to mention where one gets the item that is being remixed with - a simple link back to say “thanks, this is where I got it from” cannot hurt. It could also extend links to artists and eventually come around in the form of blog or hyperlinks.
Cory mentioned in his link that when the original CC license went up, most users chose the attribution option, which hints that even if giving something away as share-alike, the artist would at least like the recognition.
It is a fundamental thing in the idea of plagiarism as well which is an issue in universities and elsewhere. Quite simply, a reader is probably not going to come up with the most original idea themselves and the savvy teachers are pretty clued in to where the ideas are. Just come out and cite the source. Argue with it or agree wholeheartedly but don’t steal it.
So how to change it for the better? One idea might be to teach and encourage pupils in schools to credit sources properly and to do so on a regular basis. The Data Web should also clearly force licenses out into the machine readable sphere, and interfaces should be drawn up to accept licensing. That might help with research and ensure that a user is aware of what they can and cannot do and does not ignore the attribution data given.
It is a deeper issue than IP law and copyright. It’s a cultural change, and one where we need more carrots than sticks to do well.
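As a rough illustration of what pushing licenses into the machine readable sphere might look like, here is a minimal Python sketch. The record and its field names (title, creator, source, license) are made up for the example and are not any standard schema; the point is only that an interface which receives the license alongside the work can build the credit automatically and refuse to proceed when attribution data is missing.

```python
import json

# Hypothetical record: the field names and values are illustrative only.
record_json = """
{
  "title": "Example Remix",
  "creator": "A. N. Author",
  "source": "http://example.org/works/1",
  "license": "http://creativecommons.org/licenses/by-sa/3.0/"
}
"""

REQUIRED = ("title", "creator", "source", "license")


def attribution_line(record):
    """Build a human-readable credit from machine-readable metadata."""
    missing = [key for key in REQUIRED if not record.get(key)]
    if missing:
        # Without these fields the work cannot be credited properly.
        raise ValueError("cannot attribute, missing: " + ", ".join(missing))
    return '"{title}" by {creator} ({source}), licensed under {license}'.format(**record)


record = json.loads(record_json)
credit = attribution_line(record)
print(credit)
```

A remix tool could call something like attribution_line before publishing, so that crediting the source becomes the default rather than an afterthought.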
Posted in Information Retrieval, Open Knowledge | No Comments »
July 31st, 2008
The Guardian have a couple of articles which have a relevance to the notion of creative openness. Cory Doctorow extends the copyleft argument to the recent agreement between ISPs and the BPI whilst Keith Stuart explores how the games industry have dealt with piracy.
Cory Doctorow’s article uses the recent agreement between the ISPs and the music industry to point out that the real criminals will now find other outlets and go deeper underground, presumably further developing their own darknet for filesharing. All this agreement will do is to annoy/hack-off the very people who may share some tracks but go and buy a download later or go and do something creative with it. If the music industry (and quite possibly the film industry) were bothered with creativity and making a viable industry in the future, then they would be developing platforms and getting involved at the grass roots level.
Indeed this is what the flash games industry is doing for itself, according to Keith’s post on the Gamesblog. Rather than trying to sue a set of shadows, they have explored ways of making some money (not much unless you’re really popular) from the associated revenue streams, such as advertising or in-game items/levels. If a game is pirated, the creator can still get some revenue for themselves through these mechanisms.
It boggles the mind how one industry can so clearly get it and work with it, whilst another stumbles aimlessly around trying to justify its current existence.
However, if you go sideways, there are some intriguing parallels to this. Freeing data and knowledge sets allows an individual to come up with and explore new ideas. It also means that potentially some revenue will be lost if there are charges involved. Well, it’s going to happen anyway, so one might as well accept this and work on ways of making the original source more appealing and useful.
Posted in Open Knowledge | No Comments »
July 23rd, 2008
A thought. Given the number of IM and chat clients, how do we store any of the knowledge that is being transferred across them? Is it lost or can you “dump” the logs for later use?
A similar thing must be happening with SMS. I would have thought that the providers store these but can we get hold of them? Are there interfaces to dump the information for personal use or is it only in companies’ data stores?
Tags: storage
Posted in Open Knowledge | No Comments »
July 23rd, 2008
Institutional repositories already exist to store abstracts and documents. I was wondering if any of these have a way of storing blog posts or wiki pages and identifying their states; i.e. if a user was looking at a wiki page, they could see and archive edits to find its history.
Whilst wikis do this through the page history, would you then need to store the edit data differently as drafts inside the repository so that future users can immediately identify the changes and either inspect or ignore them? Would repositories need to develop their own blog search or leverage Google’s BlogSearch and Technorati?
Tags: blogging, Open Knowledge, repositories, wiki
Posted in Information Retrieval | No Comments »
July 23rd, 2008
Chris Saad has announced on his blog that the Open Web Foundation is being set up to aid in the governance of data portability technologies.
The Data Portability group has done a sterling job in evangelising and ensuring that their ideas are on the roadmap. The data silos are gradually being brought together (though I wonder if services like JISCmail and institutional repositories should also be joining in).
I truly hope that the new Foundation is receptive, welcomes the academic “market”, and continues and extends efforts to leverage the web as an open platform for sharing.
Posted in Open Knowledge | No Comments »
July 10th, 2008
Bobbie Johnson has interviewed Tim Berners-Lee for the Guardian about the new subject of web science - the study of how the Web works. Both MIT and the University of Southampton are championing the Web Science Research Initiative.
As the article says, the Web needs to remain free and open if it is to achieve its potential and to avoid being broken up or controlled by repressive regimes.
Tags: W3
Posted in Open Knowledge | No Comments »
July 10th, 2008
The Guardian reports that its Free Our Data campaign took another step closer to its goal today. Tom Watson, currently the Cabinet Office minister, is one of the forces behind a competition with the first prize of £20,000 for the best use of non-personal public data available through Showusabetterway.
Tags: freeourdata
Posted in Information Retrieval | No Comments »
July 6th, 2008
The Open Knowledge Foundation are bringing the Open Service Definition to version 1.0 which is a helpful step. I wholeheartedly agree with it. As services and APIs develop, we need to create a legal framework within which data, knowledge and dissemination services can be used to allow greater access to open knowledge now rather than when silos have been built.
However, I believe that it needs an addition to the first clause: freedom of data access.
Any methodology by which data has been transformed or is generated should be clearly explained so that, if necessary, results can be replicated. The transparency of this would allow commercial and educational users a greater confidence in the data presented.
Perhaps it is more along the lines of the Open Knowledge Definition but I think it is an important point to make clear rather than leaving it implicit.
Tags: open_service
Posted in Open Knowledge | No Comments »
July 6th, 2008
Last week I went along to the ISKO UK seminar/event on Information Retrieval (IR) held at University College London.
Brian Vickery gave a talk about the first fifty years or so of IR.
Like any good event, I came away with loads to ponder. I’m still pondering some of my notes (I wish my handwriting was neater…)
Stephen Robertson of the Microsoft Research lab talked about where search was beginning to go and what was being explored by companies such as Microsoft and Google.
During the Q&A session, there seemed to be a theme of users using Google and Yahoo et al for quick references, such as three word searches. To some extent this is probably for the ease of the interface but…
What I really got out of this was an intriguing thought.
Google et al are good at general searches. They can find vast amounts of data quickly and easily and present them via the algorithm to the user. Their search is horizontal.
Yet repositories can contain vast amounts of better search data than the search engines can create, using controlled vocabularies, RDF, RDFa and so on. They have people writing the classifications for them who know the subject well (we hope) and can make more rational judgments than a search engine. So repositories and data stores could come together and leverage their inherently more detailed search via XML and RDF interfaces, linking to each other and allowing the search engines rapid access to the relevant data. Their search is vertical.
The experts in the field will probably already know where the relevant stores are but casual and non-academic users will not. Nor are they likely to take the time to delve through advanced searches. We are time pressured. The vertical search engines may well not have the same resources as our large search friends but a few adaptations should allow better access and also lever the knowledge into a more public sphere.
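To make the RDF interface idea a little more concrete, here is a small Python sketch of how a repository might expose one of its expert-classified records as N-Triples that a search engine or another repository could read. The item URI, title, author and subject terms are invented for the example; the property names are the Dublin Core terms title, creator and subject, and everything else is an assumption rather than how any particular repository works.

```python
# Dublin Core terms namespace, used here for the title, creator and subject properties.
DCTERMS = "http://purl.org/dc/terms/"


def literal(text):
    """Quote a plain string as an N-Triples literal, escaping special characters."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return '"' + escaped + '"'


def to_ntriples(uri, title, creator, subjects):
    """Serialise one repository record as N-Triples, one statement per line."""
    triples = [
        (uri, DCTERMS + "title", literal(title)),
        (uri, DCTERMS + "creator", literal(creator)),
    ]
    # Each controlled-vocabulary term becomes its own subject statement.
    triples += [(uri, DCTERMS + "subject", literal(term)) for term in subjects]
    return "\n".join("<{}> <{}> {} .".format(s, p, o) for s, p, o in triples)


# Hypothetical record: the URI and all values are placeholders.
doc = to_ntriples(
    "http://example.org/repository/item/42",
    "An example deposited paper",
    "A. N. Author",
    ["Information retrieval", "Controlled vocabularies"],
)
print(doc)
```

The subject lines are where the human classification work shows up: a crawler that reads them gets the expert's vocabulary for free instead of having to guess it from the text.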
Posted in Information Retrieval | No Comments »