<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>The Aust Gate &#187; Open Knowledge</title>
	<atom:link href="http://austgate.co.uk/category/openknowledge/feed/" rel="self" type="application/rss+xml" />
	<link>http://austgate.co.uk</link>
	<description>Open Knowledge and Literature</description>
	<lastBuildDate>Sun, 25 Jul 2010 15:19:13 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>BBC&#8217;s use of Semantic Web technology in World Cup</title>
		<link>http://austgate.co.uk/2010/07/bbcs-use-of-semantic-web-technology-in-world-cup/</link>
		<comments>http://austgate.co.uk/2010/07/bbcs-use-of-semantic-web-technology-in-world-cup/#comments</comments>
		<pubDate>Tue, 13 Jul 2010 19:27:31 +0000</pubDate>
		<dc:creator>iain_emsley</dc:creator>
				<category><![CDATA[Open Knowledge]]></category>
		<category><![CDATA[bbc]]></category>
		<category><![CDATA[open_correspondence]]></category>
		<category><![CDATA[rdf]]></category>

		<guid isPermaLink="false">http://austgate.co.uk/?p=185</guid>
		<description><![CDATA[Just caught this story on ReadWrite Web about the BBC website&#8217;s use of semantic web technology during the World Cup.  Jem Rayfield explains more on the BBC Internet blog about the use of technology. I&#8217;ve still got a fair amount of reading to do but this is the sort of project that makes me rethink [...]]]></description>
			<content:encoded><![CDATA[<p>Just caught this story on ReadWrite Web about the <a title="ReadWriteWeb on BBC's Semantic Web" href="http://www.readwriteweb.com/archives/bbc_world_cup_website_semantic_technology.php" target="_blank">BBC website&#8217;s use of semantic web technology</a> during the World Cup. Jem Rayfield explains more about the technology on the <a title="Jem Rayfield talking about the BBC use of semantic web on BBC Sport site" href="http://www.bbc.co.uk/blogs/bbcinternet/2010/07/bbc_world_cup_2010_dynamic_sem.html" target="_blank">BBC Internet</a> blog.</p>
<p>I&#8217;ve still got a fair amount of reading to do but this is the sort of  project that makes me rethink the Open Letters project and how it could  be used by other sites. It has also given me food for thought for work as well.</p>
]]></content:encoded>
			<wfw:commentRss>http://austgate.co.uk/2010/07/bbcs-use-of-semantic-web-technology-in-world-cup/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Weeknotes: documentation, prototyping and cats</title>
		<link>http://austgate.co.uk/2010/07/weeknotes-documentation-prototyping/</link>
		<comments>http://austgate.co.uk/2010/07/weeknotes-documentation-prototyping/#comments</comments>
		<pubDate>Sun, 11 Jul 2010 15:31:20 +0000</pubDate>
		<dc:creator>iain_emsley</dc:creator>
				<category><![CDATA[Open Knowledge]]></category>
		<category><![CDATA[weeknotes]]></category>
		<category><![CDATA[open_correspondence]]></category>
		<category><![CDATA[xml]]></category>

		<guid isPermaLink="false">http://austgate.co.uk/?p=181</guid>
		<description><![CDATA[I&#8217;ve spent most of the week trying to persuade colleagues that existing services need rewriting. I&#8217;ve also finally managed to get the initial promise of working from home so hopefully I&#8217;ll be able to get the rewrite started on the &#8220;quiet&#8221; days away from the office. (Although the cat can drive me [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve spent most of the week trying to persuade colleagues that existing services need rewriting. I&#8217;ve also finally managed to get the initial promise of working from home so hopefully I&#8217;ll be able to get the rewrite started on the &#8220;quiet&#8221; days away from the office. (Although the cat can drive me nuts before she goes to sleep at 10am).</p>
<p>Still working on the accounts project which keeps unravelling a series of underlying problems. Most of them we know about but they appear in all sorts of odd places.</p>
<p>Assuming the world doesn&#8217;t fall on my head next time I&#8217;m in the office, I&#8217;m going to try and spend the day at home on a &#8220;Fedex&#8221; day. I&#8217;m taking the notion from an issue of Wired where they were talking about different ways of working and Atlassian mentioned &#8220;Fedex&#8221; days where you spend a day building a prototype. What I&#8217;d really like to get prototyped is the service bus / queuing system. So fingers crossed.</p>
<p>The impetus came from updating the disaster recovery documentation and writing the first department of the service status documentation (which I wrote after getting the last bit of debugging finished). I know that documentation is not everybody&#8217;s favourite thing but I find it useful in rethinking the system and making sure it fits together.</p>
<p>I&#8217;ve made time to rewrite the load function for Open Letters. I&#8217;ve got the document building the letters in XML and written a rough upload script. The next task is to rewrite the main.py script, test the XML loading and then finish tidying up the initial document.</p>
<p>I&#8217;m also looking forward to Textcamp so it&#8217;ll be great to get the load finished (as it normalises the function) and get on with doing a presentation for the camp.</p>
<p>I&#8217;m also coming to the end of writing my book on children&#8217;s fantasy. Whilst not technical in an IT sense, I&#8217;m thinking of the next project on the New Weird and how to use IT to visualise influences and timelines. The one thing that worries me is archiving the web pages needed for the research, which I need to look into as I&#8217;m not sure whether it is technically legal.</p>
]]></content:encoded>
			<wfw:commentRss>http://austgate.co.uk/2010/07/weeknotes-documentation-prototyping/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Weeknotes: Pylons, Python and printing</title>
		<link>http://austgate.co.uk/2010/05/weeknotes-pylons-python-and-printing/</link>
		<comments>http://austgate.co.uk/2010/05/weeknotes-pylons-python-and-printing/#comments</comments>
		<pubDate>Sun, 30 May 2010 10:22:41 +0000</pubDate>
		<dc:creator>iain_emsley</dc:creator>
				<category><![CDATA[Open Knowledge]]></category>
		<category><![CDATA[open_correspondence]]></category>
		<category><![CDATA[open_literature]]></category>
		<category><![CDATA[printing]]></category>
		<category><![CDATA[Python]]></category>

		<guid isPermaLink="false">http://austgate.co.uk/?p=157</guid>
		<description><![CDATA[I&#8217;ve been doing some more work on the Open Correspondence website (which is now functional thanks to Rufus Pollock&#8217;s help). In part I&#8217;ve been cleaning up the URLs for the data controller (which is still coming along) and trying to tie the views together. Being happier with Apache and PHP, I spent some time [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been doing some more work on the Open Correspondence website (which is now functional thanks to Rufus Pollock&#8217;s help). In part I&#8217;ve been cleaning up the URLs for the data controller (which is still coming along) and trying to tie the views together. Being happier with Apache and PHP, I spent some time looking for how to rewrite the URLs until I came across <a title="Andre Kolell on Pylons" href="http://blog.andrekolell.de/2009/04/26/the-pylons-web-framework/" target="_blank">Andre Kolell&#8217;s blog post</a> about the internal workings of the middleware in the <a title="Pylons framework" href="http://pylonshq.com/" target="_blank">Pylons framework</a>. The more I do on the project, the more I learn about both Python and Pylons.</p>
<p>One of the next things to do is to reformat the dates into a human-readable format. I had thought of using the strftime method in Python&#8217;s <a title="python's date time module" href="http://docs.python.org/library/datetime.html" target="_blank">datetime</a> module to reformat the date from its current ISO format (YYYY-MM-DD) into day, month, year. Unfortunately, the documentation for the method states &#8220;years before 1900 cannot be used.&#8221; A slight cramp in the plan. However, there is an <a title="Andrew Dalke's Activestate date recipe" href="http://code.activestate.com/recipes/306860-proleptic-gregorian-dates-and-strftime-before-1900/" target="_blank">ActiveState recipe</a> by Andrew Dalke which might do the trick or at least point me in the right direction. It is one of the things to be tidied up at some point.</p>
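<p>A minimal sketch of one way around the 1900 limit, assuming the stored dates are always well-formed ISO strings: split the string and look the month name up in a list, so that strftime is never called. This is not the ActiveState recipe, and the function name humanise_iso_date is made up for illustration; it is not part of the Open Correspondence code.</p>

```python
import datetime

MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def humanise_iso_date(iso_date):
    """Turn 'YYYY-MM-DD' into 'D Month YYYY' without calling strftime."""
    year, month, day = (int(part) for part in iso_date.split("-"))
    # datetime.date only validates the values here (it rejects month 13,
    # 30 February and so on); it accepts any year from 1 to 9999.
    checked = datetime.date(year, month, day)
    return "{0} {1} {2}".format(checked.day, MONTHS[checked.month - 1], checked.year)


# An arbitrary nineteenth-century date, purely as an example.
print(humanise_iso_date("1837-03-31"))
```

<p>Building the string by hand keeps the day, month, year order explicit and avoids depending on how a particular Python version or platform treats early years in strftime.</p>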
<p>It is a good feeling to have the site running now. The next task is to write the tests and then to refactor the code. It is very PHPish and needs to be made more Pythonic. I&#8217;ve got an idea for trying to create a dendrogram around the textReferred element and to discover the letters and correspondents around the books that Dickens was writing. One of the things still to do is to continue loading the other volumes of Dickens&#8217;s letters into the site. So version 0.2 is a little way off but the light at the end of the tunnel is not a train this time.</p>
<p>Workwise has been a little hectic. I must make some time to write a method to allow our admin team to resubmit applications. Like so many things it is a balance between a five minute job and the two hour ones that need to be done. The major job for the week though was getting the automated printing working.</p>
<p>One of the jobs that admin do is to go through each client and create the packs for them. Using HTMLtools, I&#8217;ve managed to compile the html into PDF and then convert the PDF into a PostScript file for a printer. I&#8217;ve managed to use the <a title="Line Printer Remote protocol wikipedia page" href="http://en.wikipedia.org/wiki/Line_Printer_Remote" target="_blank">Line Printer Remote</a> protocol to send the job to the printer. It is a simple enough command:</p>
<p>lpr -S &lt;ip address/name of printer&gt; -P &lt;name of print queue&gt; (-o l, optional, marks the file as binary) &lt;name of file&gt;</p>
<p>Windows doesn&#8217;t appear to support the full protocol but enough to be useful. The -o switch appears to only define whether the file is binary or not rather than specifying the paper type and so on. Annoying but it can be got around.</p>
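<p>As a rough sketch, the same command could be assembled from a script rather than typed by hand. The function below only builds the argument list; the printer address, queue name and file name are hypothetical, and the binary flag uses -o l (a lower-case L), which is how the Windows lpr help text describes the option for binary files such as PostScript.</p>

```python
def build_lpr_command(printer_host, queue_name, file_path, binary=False):
    """Return the argument list for the Windows lpr call described above."""
    command = ["lpr", "-S", printer_host, "-P", queue_name]
    if binary:
        # "-o l" (lower-case L) marks the file as binary rather than text.
        command.extend(["-o", "l"])
    command.append(file_path)
    return command


# Hypothetical printer address, queue name and file, purely for illustration.
print(build_lpr_command("192.168.0.50", "packs", "client_pack.ps", binary=True))
```

<p>The resulting list can be handed to subprocess.call from the standard library, which avoids quoting problems with file names that contain spaces.</p>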
<p>Anyhow, it got me thinking about other ways of using commands to explore how texts can be converted and changed into useful objects. It brings me back to the use of psbook for printing, and how to make it useful for an average user who does not necessarily want to run various commands. I had a conversation with my friend Darren Nash, editorial director of Orbit Books, about the future of publishing, and he opined that small presses would come to the fore. I think that, certainly in genre, this is correct. It would be interesting to see how existing tools could be used towards these ends rather than constantly re-inventing the wheel.</p>
<p>Now that the first version of letters is out of the way, it is time to go over other projects. I&#8217;ve got a yen to try and create something from Milton&#8217;s <a title="Wikipedia on the Areopagitica" href="http://en.wikipedia.org/wiki/Areopagitica" target="_blank">Areopagitica</a>, appropriate I think as it is a cry for free presses.</p>
]]></content:encoded>
			<wfw:commentRss>http://austgate.co.uk/2010/05/weeknotes-pylons-python-and-printing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Weeknotes: Data mining, XML and bibliographies</title>
		<link>http://austgate.co.uk/2010/05/weeknotes-data-mining-xml-and-bibliographies/</link>
		<comments>http://austgate.co.uk/2010/05/weeknotes-data-mining-xml-and-bibliographies/#comments</comments>
		<pubDate>Sun, 23 May 2010 10:57:25 +0000</pubDate>
		<dc:creator>iain_emsley</dc:creator>
				<category><![CDATA[Information Retrieval]]></category>
		<category><![CDATA[Open Knowledge]]></category>
		<category><![CDATA[projects]]></category>
		<category><![CDATA[open_bibliography]]></category>
		<category><![CDATA[open_correspondence]]></category>
		<category><![CDATA[rdf]]></category>
		<category><![CDATA[redis]]></category>

		<guid isPermaLink="false">http://austgate.co.uk/?p=155</guid>
		<description><![CDATA[It seems to have been a week of frantic completion and refactoring. The first half was spent frantically converting HTML pages into PDFs using VeryPDF&#8217;s HTMLtools server product. All in all the manual is very helpful and the test server could be set up quickly. It might have helped the other end if I&#8217;d [...]]]></description>
			<content:encoded><![CDATA[<p>It seems to have been a week of frantic completion and refactoring.</p>
<p>The first half was spent frantically converting HTML pages into PDFs using VeryPDF&#8217;s <a title="VeryPDF htmltools command line manual" href="http://www.verypdf.com/htmltools/html-tools.html" target="_blank">HTMLtools</a> server product. All in all the manual is very helpful and the test server could be set up quickly. It might have helped the other end if I&#8217;d remembered to break the file up for printing but that turned out to be a 10 minute job to put back into production. The next task is to transfer it from the test server onto the production one but that&#8217;ll need to wait for networking to tweak it a little.</p>
<p>I spent some time refactoring the call recordings archive. For some reason the archiving solution that I hacked up in November decided to start failing in March after it was changed. Despite being put back to its original state it never quite got back to working as it did. I&#8217;ve been trying to tweak it on and off but never found the time to complete it. I finally just made the time on Friday afternoon to look at it properly. I&#8217;d been thinking about item-based filtering after reading the first chapter of Toby Segaran&#8217;s <a title="O'Reilly page for Programming Collective Intelligence" href="http://oreilly.com/catalog/9780596529321/" target="_blank">Programming Collective Intelligence</a>. (On the back of this, I think I&#8217;ll be buying his <a title="O'Reilly page for Beautiful Data" href="http://oreilly.com/catalog/9780596157128/" target="_blank">Beautiful Data</a> at some point.) Although this is not really an intelligent programme as such, the techniques have shown some real promise in the hurried tests. Using a Redis datastore, the percentage of found recordings is way up. Fingers crossed for Monday morning when I can see what the scripts did over the weekend. I also spent some time simplifying the matching algorithm so that I didn&#8217;t have to account for so many edge cases when dealing with time.</p>
<p>It seems that we are approaching some sort of real-time status update system at work. I&#8217;ve sort of been arguing for this for a while to remove the bottlenecks of having each system dependent on another one. One of our suppliers is sending us XML data so I&#8217;ve been playing with XPath 1.0 to extract the relevant values (XPath 2.0 apparently isn&#8217;t directly supported by PHP; there might be a way of passing the data to Java, but that adds unnecessary overhead). Anyhow the core is running but I still need to fully test it and add in security.</p>
<p>I&#8217;ve also been asked to design and implement a queueing system for the main internal server. I&#8217;ve run up a quick high level overview but the detail still needs to be worked on. I&#8217;m pushing it back to June so that I can clear the decks of the older projects that are still on the board.</p>
<p>I had a chat with <a title="Jonathan Gray's blog" href="http://jonathangray.org/" target="_blank">Jonathan Gray</a>, a sound guy who does far too much, about digital humanities ideas. We&#8217;ve agreed to keep closer contact with each other about the area and to encourage each other into actually doing stuff (I have half a Moleskine of ideas &#8211; time for more code, less talk then). He proposed the <a title="Jonathan Gray on Bibliographica" href="http://austgate.co.uk/2010/01/bibliographica-open-bibliographic-sourcing-and-maintenance/" target="_blank">Bibliographica idea</a> in January and the team wrote <a title="Bibliographica entry on the blog" href="http://blog.okfn.org/2010/05/20/bibliographica-an-introduction/" target="_blank">a blog entry</a> for the Open Knowledge Foundation blog. It is an idea that I&#8217;m looking forward to playing with and trying to embed data from. (<a href="http://bibliographica.org/">http://bibliographica.org/</a>)</p>
<p>One of the things that I&#8217;ve been thinking about, though, is that increasingly when we do research, we store web pages, blog entries and so on. Whilst there is a way of recording these in a footnote (http://example.org accessed on &lt;insert date&gt; type thing), there does not appear to be a way of building a local archive of these with the relevant metadata for later retrieval. I don&#8217;t know about anybody else but I&#8217;ve got a fair few pages dotted around my hard drive for projects and I&#8217;d like a way of storing these properly and to be able to integrate them into bibliographies or research notes. I know that there is the WARC format (<a title="Library of Congress on WARC" href="http://www.digitalpreservation.gov/formats/fdd/fdd000236.shtml" target="_blank">Library of Congress</a> link and the <a title="WARC tools on Google code" href="http://code.google.com/p/warc-tools/" target="_blank">WARC tools</a> Google code project) to play with so I need to make time to do that.</p>
<p>I had a mini-hack on the Open Correspondence project last Sunday intending to update a couple of pages and got a little more done than that. The database needs rebuilding but the purl reference (<a title="Letter schema PURL" href="http://purl.org/letter" target="_blank">http://purl.org/letter</a>) now points to the schema. It is so close that I can&#8217;t wait to actually start hacking the data. Time to do the last little bits like tidy up the parser, use the weaving history API to embed a timeline and start using <a title="jena sourceforge archive" href="http://jena.sourceforge.net/" target="_blank">JENA</a>, <a title="ARC website" href="http://arc.semsol.org" target="_blank">ARC</a> and Chris Gutteridge&#8217;s <a title="Graphite rdf library" href="http://graphite.ecs.soton.ac.uk/" target="_blank">Graphite</a> library which worked out of the box (but I haven&#8217;t used it for much yet).</p>
<p>Goals for this week are to finish the Open Correspondence bits, update the trac instance with the various &#8216;todo&#8217;s, write a blog post for the Open Knowledge Foundation for Open Correspondence, do some major testing this week at work on various XML exports and imports. I should just be about caught up then. With any luck&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://austgate.co.uk/2010/05/weeknotes-data-mining-xml-and-bibliographies/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Weeknotes: Redis, RDF, rdflib and openletters</title>
		<link>http://austgate.co.uk/2010/05/weeknotes-redis-rdf-rdflib-and-openletters/</link>
		<comments>http://austgate.co.uk/2010/05/weeknotes-redis-rdf-rdflib-and-openletters/#comments</comments>
		<pubDate>Sat, 15 May 2010 14:57:14 +0000</pubDate>
		<dc:creator>iain_emsley</dc:creator>
				<category><![CDATA[Open Knowledge]]></category>
		<category><![CDATA[projects]]></category>
		<category><![CDATA[open_correspondence]]></category>
		<category><![CDATA[redis]]></category>

		<guid isPermaLink="false">http://austgate.co.uk/?p=152</guid>
		<description><![CDATA[I&#8217;ve been trying to play catch up this week at work. One of the projects that I&#8217;ve been working on is the temporary storage of information. For one reason or another, one of the workers has decided to occasionally throw a fit and not do its job properly (on top of a connection that appears [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been trying to play catch up this week at work.</p>
<p>One of the projects that I&#8217;ve been working on is the temporary storage of information. For one reason or another, one of the workers has decided to occasionally throw a fit and not do its job properly (on top of a connection that appears to fail at odd times). What I really needed was a temporary store to save the parsed information so that if something failed, we didn&#8217;t lose everything. To that end, I&#8217;ve started looking at <a title="Redis code base" href="http://code.google.com/p/redis/" target="_blank">Redis</a> in more detail and started using the Windows build of version 1.2.1 (available on <a title="Aspninja and redis" href="http://www.aspninja.com/2010/01/23/using-redis-on-asp-net-example-twitter-clone-retwis-c/" target="_blank">aspninja.com</a>) with the <a title="Rediska library" href="http://rediska.geometria-lab.net/" target="_blank">Rediska</a> library. At some point I&#8217;ll sit down and compile it on my laptop under Cygwin to get the latest version.</p>
<p>I ended up using the PEAR version of Rediska and managed to get it up and running fairly quickly. One of the things that I needed to do was to call a new instance of the list that I was creating in each method, having split the set and get methods into two workers. The speed of Redis is fantastic and the server happily runs on the test server caching the data and allowing another worker to load into a copy of the MySQL tables that it will eventually update. I found the Rediska library really easy to use and I&#8217;ll be using it for various projects at home to do some processing rather than using MySQL all the time. <a title="Simon Willison on redis" href="http://simonwillison.net/2010/Apr/25/redis/" target="_blank">Simon Willison</a> has a post which links to <a title="Simon Willison on redis" href="http://simonwillison.net/static/2010/redis-tutorial/" target="_blank">a tutorial on Redis</a> that I found extremely useful and encouraging in finding more about the server in future.</p>
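<p>The split between a worker that sets and a worker that gets can be sketched in Python (this is not the PHP and Rediska code used at work). To keep it runnable without a server, a small in-memory class stands in for the Redis client and mimics only the RPUSH and LPOP list commands; the key name and the record fields are made up for illustration.</p>

```python
import json
from collections import defaultdict, deque


class InMemoryLists:
    """Stand-in for a Redis client: only the RPUSH and LPOP list commands."""

    def __init__(self):
        self._lists = defaultdict(deque)

    def rpush(self, key, value):
        # Append to the tail of the list and report its new length.
        self._lists[key].append(value)
        return len(self._lists[key])

    def lpop(self, key):
        # Remove and return the head of the list, or None when it is empty.
        items = self._lists[key]
        return items.popleft() if items else None


QUEUE_KEY = "parsed:records"  # hypothetical key name


def store_parsed(client, record):
    """First worker: park a parsed record so a later failure does not lose it."""
    client.rpush(QUEUE_KEY, json.dumps(record))


def drain_parsed(client):
    """Second worker: take records off in arrival order, ready for loading."""
    records = []
    while True:
        raw = client.lpop(QUEUE_KEY)
        if raw is None:
            break
        records.append(json.loads(raw))
    return records


store = InMemoryLists()
store_parsed(store, {"id": 1, "status": "new"})
store_parsed(store, {"id": 2, "status": "new"})
print(drain_parsed(store))
```

<p>Pushing on one end and popping from the other keeps the records in arrival order, and serialising each record to JSON means the store only ever holds plain strings.</p>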
<p>I&#8217;ve been working on the RDF exports for the <a title="Open Correspondence website" href="http://opencorrespondence.org" target="_blank">open letters</a> project which are yet to go live. The main job has been making sure that the exports validate using the RDF validator and pulling in the data. A future task is to finish tidying up the data but I&#8217;m trying to get the letter HTML template figured out. Since Python isn&#8217;t the main language that I now use (work is entirely based on PHP), I&#8217;ve been taking a look at the <a title="open shakespeare website" href="http://openshakespeare.org" target="_blank">Open Shakespeare</a> code and found the RDFa work that I did a year ago and had completely forgotten about. It would be good to get RDFa into Open Correspondence but I think that is a later task. The main thing is to complete the initial port. I managed to get www.purl.org/letter forwarding to the site but need to get a schema page up and the purl correctly referring to the right page.</p>
<p>One of the things that I&#8217;ve been trying to play with is <a title="rdflib python library" href="http://code.google.com/p/rdflib/" target="_blank">RDFlib</a> on Windows. I built it successfully on my last laptop (Windows XP, Cygwin) but for some reason version 2.4.2 would not build on Vista, even under easy_install. I&#8217;ve been trying with version 3 (which was released on May 13th according to the news group) and apparently the <a title="rdfextras project" href="http://code.google.com/p/rdfextras/" target="_blank">rdfextras</a> project has a pure Python version of the SPARQL parser that was failing to build. I&#8217;ll be trying that once the current work on Open Correspondence has been completed to explore what we can do with the data.</p>
<p>Ben O&#8217;Steen talked at the Open Knowledge conference after me and one of the things he talked about was the psutils package. I&#8217;ve found it on <a title="Cygwin site" href="http://www.cygwin.com" target="_blank">Cygwin</a> and downloaded it, so it would be good to have fun with that one or to find accessible <a title="PSUtils Windows port" href="http://gnuwin32.sourceforge.net/packages/psutils.htm" target="_blank">Windows ports</a> for people who don&#8217;t necessarily want to download Cygwin.</p>
]]></content:encoded>
			<wfw:commentRss>http://austgate.co.uk/2010/05/weeknotes-redis-rdf-rdflib-and-openletters/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Date set for Textcamp</title>
		<link>http://austgate.co.uk/2010/05/date-set-for-textcamp/</link>
		<comments>http://austgate.co.uk/2010/05/date-set-for-textcamp/#comments</comments>
		<pubDate>Wed, 05 May 2010 08:45:26 +0000</pubDate>
		<dc:creator>iain_emsley</dc:creator>
				<category><![CDATA[Open Knowledge]]></category>
		<category><![CDATA[open_literature]]></category>
		<category><![CDATA[textcamp]]></category>

		<guid isPermaLink="false">http://austgate.co.uk/?p=150</guid>
		<description><![CDATA[The provisional date for Textcamp has been set for August 21st on the twitter feed.]]></description>
			<content:encoded><![CDATA[<p>The provisional date for <a title="Textcamp website" href="http://textcamp.org/index.php/Main_Page" target="_blank">Textcamp</a> has been set for August 21st on the <a title="Textcamp twitter feed" href="http://twitter.com/textcamp" target="_blank">twitter feed</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://austgate.co.uk/2010/05/date-set-for-textcamp/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A change to the Letters project</title>
		<link>http://austgate.co.uk/2010/03/a-change-to-the-letters-project/</link>
		<comments>http://austgate.co.uk/2010/03/a-change-to-the-letters-project/#comments</comments>
		<pubDate>Sun, 28 Mar 2010 11:19:55 +0000</pubDate>
		<dc:creator>iain_emsley</dc:creator>
				<category><![CDATA[Open Knowledge]]></category>
		<category><![CDATA[Text Mining]]></category>
		<category><![CDATA[projects]]></category>
		<category><![CDATA[letters]]></category>

		<guid isPermaLink="false">http://austgate.co.uk/?p=145</guid>
		<description><![CDATA[During the previously blogged dinner with Ben and Rufus, we talked about the nascent work on the letters project. Both have &#8220;encouraged&#8221; me (it didn&#8217;t take too much persuasion, it must be said) to move the project to the Open Knowledge Foundation and to port it to Python with a Redis backend rather than the [...]]]></description>
			<content:encoded><![CDATA[<p>During the previously blogged dinner with Ben and Rufus, we talked about the nascent work on the letters project. Both have &#8220;encouraged&#8221; me (it didn&#8217;t take too much persuasion, it must be said) to move the project to the Open Knowledge Foundation and to port it to Python with a Redis backend rather than the current PHP/MySQL set up. I hope that the move will be complete soon.</p>
]]></content:encoded>
			<wfw:commentRss>http://austgate.co.uk/2010/03/a-change-to-the-letters-project/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Textcamp announced</title>
		<link>http://austgate.co.uk/2010/03/textcamp-announced/</link>
		<comments>http://austgate.co.uk/2010/03/textcamp-announced/#comments</comments>
		<pubDate>Sun, 28 Mar 2010 11:16:22 +0000</pubDate>
		<dc:creator>iain_emsley</dc:creator>
				<category><![CDATA[Information Retrieval]]></category>
		<category><![CDATA[Text Mining]]></category>
		<category><![CDATA[textcamp]]></category>

		<guid isPermaLink="false">http://austgate.co.uk/?p=143</guid>
		<description><![CDATA[Had dinner with Rufus Pollock and Ben O&#8217;Steen on Monday in Oxford. As part of the discussions, the notion of Textcamp was raised and Ben has created the Textcamp website with an associated blog. It is a slightly bigger concept than I had had but the approach, I think, will allow the creation of a [...]]]></description>
			<content:encoded><![CDATA[<p>Had dinner with <a title="Rufus Pollock's website" href="http://www.rufuspollock.org/" target="_blank">Rufus Pollock</a> and <a title="Ben O'Steen's blog" href="http://oxfordrepo.blogspot.com/" target="_blank">Ben O&#8217;Steen</a> on Monday in Oxford. As part of the discussions, the notion of Textcamp was raised and Ben has created the <a title="Textcamp website" href="http://textcamp.org/" target="_blank">Textcamp website</a> with an associated <a title="Textcamp blog" href="http://blog.textcamp.org/" target="_blank">blog</a>. It is a slightly bigger concept than I had had but the approach, I think, will allow the creation of a wider community and a place to publicly follow up any ideas that get thrown up. I like the idea of hacking texts as well and it will be great to have a place to discuss ideas and to learn. Equally Ben&#8217;s post makes it clear that it should be friendly and helpful leading up to a Barcamp style event. It is slated to run in August or September. I can&#8217;t wait.</p>
]]></content:encoded>
			<wfw:commentRss>http://austgate.co.uk/2010/03/textcamp-announced/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Creating the text ontology</title>
		<link>http://austgate.co.uk/2010/03/creating-the-text-ontology/</link>
		<comments>http://austgate.co.uk/2010/03/creating-the-text-ontology/#comments</comments>
		<pubDate>Thu, 18 Mar 2010 20:34:06 +0000</pubDate>
		<dc:creator>iain_emsley</dc:creator>
				<category><![CDATA[Open Knowledge]]></category>
		<category><![CDATA[ontology]]></category>
		<category><![CDATA[rdf]]></category>

		<guid isPermaLink="false">http://austgate.co.uk/?p=132</guid>
		<description><![CDATA[I&#8217;ve been working quietly on ideas for an ontology to describe relationships in a letter from the correspondent to people referred to in the text. It is intended to complement and extend the Dublin Core and FOAF (Friend of a Friend) namespaces. Anyhow I&#8217;ve decided to publish a first set of thoughts on it having sat [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been working quietly on ideas for an ontology to describe relationships in a letter from the correspondent to people referred to in the text. It is intended to complement and extend the Dublin Core and FOAF (Friend of a Friend) namespaces. Anyhow I&#8217;ve decided to publish a first set of thoughts on it having sat on the project for a while. I&#8217;ve sort of thought of it as using the text namespace in the text, which I am currently doing, but it is not set in stone.</p>
<p>Simple Ontology for Relationships in Texts</p>
<p>Text namespace</p>
<p>austgate.co.uk/ontology/text</p>
<p>Definition: An ontology which allows text items, such as letters, to be linked together. It extends and complements Dublin Core (DC) and Friend of a Friend (FOAF).</p>
<p>Terms</p>
<p>Appearsin</p>
<p>The term is used to denote a work in which a character appears. For example:<br />
Dear Alice,</p>
<p>As you may know I am coming to the end of the latest draft of the Ponsonby diaries. Bob Ponsonby is making his way across the marshes&#8230;</p>
<p>The character Bob Ponsonby could be referenced as text:Appearsin to denote his appearance in the work. This allows queries to find documents where the characters from a work appear, rather than just individual characters. It would usually be considered as a collection of text:Character references.</p>
<p>Character</p>
<p>A fictional person who is referenced in the text. This element is used to disambiguate between fictional and non-fictional people. Non-fictional, i.e. real, people are denoted by foaf:Person. Character is a subset of foaf:Person and is intended for fictional people. For example, in a letter from an author to an agent, the author may be describing their latest project.</p>
<p>Dear Alice,</p>
<p>As you may know I am coming to the end of the latest draft of the Ponsonby diaries. Bob Ponsonby is making his way across the marshes&#8230;</p>
<p>In the example, Alice is a real person and could be denoted as such by using foaf:Person, but Bob Ponsonby is equally a name and a person. Since he is fictional in this letter, he could be denoted as text:Character in any RDF representation to allow users to link documents where the character is mentioned.</p>
<p>&lt;text:character<br />
rdf:about=&quot;http://austgate.co.uk/Dickens/characters/pickwick&quot;&gt;<br />
&lt;foaf:name&gt;Mr. Pickwick&lt;/foaf:name&gt;<br />
&lt;text:appearsin<br />
rdf:resource=&quot;http://austgate.co.uk/Dickens/works/pickwickpapers&quot; /&gt;<br />
&lt;/text:character&gt;</p>
<p>Correspondent<br />
This field denotes the correspondent of the letter. It is a subset of foaf:Person as it should denote a real person. (However, it is perfectly possible for a fictional letter to be written, in which case it would perhaps be inappropriate to use foaf:Person.)</p>
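<p>As a rough sketch only, and assuming the same lower-case element style as the character example above, a correspondent might be serialised along these lines. The person URI is borrowed from the Work example further down and the name is filled in purely for illustration; neither is a settled part of the ontology.</p>
<p>&lt;!-- illustrative sketch: rdf:about is used because the identifier is a full URI --&gt;<br />
&lt;text:correspondent<br />
rdf:about=&quot;http://austgate.co.uk/dickens/people/CharlesDickens&quot;&gt;<br />
&lt;!-- name assumed for illustration --&gt;<br />
&lt;foaf:name&gt;Charles Dickens&lt;/foaf:name&gt;<br />
&lt;/text:correspondent&gt;</p>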
<p>textReferred<br />
This refers to a text (book, verse or similar) which is mentioned in the letter being serialised. It is intended to allow graphs to be built between letters that refer to the same text, showing what an author was doing or thinking about that text around the time of writing it or afterwards. It is designed to allow for some contextualisation of the referred work. It could also be used to build a reading list, or to identify possible influences or forgotten works that the author was aware of at the time.</p>
<p>Work</p>
<p>The term denotes a type of text, in this case a book. It would be a collection of Dublin Core terms.<br />
&lt;text:work rdf:about=&quot;http://austgate.co.uk/dickens/work/pickwick&quot;&gt;<br />
&lt;dc:title&gt;Pickwick Papers&lt;/dc:title&gt;<br />
&lt;dc:creator<br />
rdf:resource=&quot;http://austgate.co.uk/dickens/people/CharlesDickens&quot; /&gt;<br />
&lt;dc:publisher&gt;Chapman and Hall&lt;/dc:publisher&gt;<br />
&lt;/text:work&gt;</p>
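<p>To make the textReferred term a little more concrete, here is a hedged sketch of how a letter mentioning the work above might look. The letter URI is hypothetical, the http:// prefix and trailing # on the text namespace are my assumptions about how austgate.co.uk/ontology/text would be declared, and using a plain rdf:Description for the letter is only one option.</p>
<p>&lt;rdf:RDF<br />
xmlns:rdf=&quot;http://www.w3.org/1999/02/22-rdf-syntax-ns#&quot;<br />
xmlns:text=&quot;http://austgate.co.uk/ontology/text#&quot;&gt;<br />
&lt;!-- hypothetical letter URI, for illustration only --&gt;<br />
&lt;rdf:Description rdf:about=&quot;http://austgate.co.uk/dickens/letters/example&quot;&gt;<br />
&lt;!-- the letter refers to the Pickwick Papers work described above --&gt;<br />
&lt;text:textReferred rdf:resource=&quot;http://austgate.co.uk/dickens/work/pickwick&quot; /&gt;<br />
&lt;/rdf:Description&gt;<br />
&lt;/rdf:RDF&gt;</p>
<p>Letters that point at the same work URI could then be gathered into one graph for that work.</p>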
<p>I&#8217;m still working on applying some of this to my letters project (which sort of came about because of, and from curiosity about, the idea). Many thanks to <a title="Brian Matthew's stfc page" href="http://www.e-science.stfc.ac.uk/People/brian_matthews5088.html" target="_blank">Brian Matthews</a> of the <a href="http://e-science.stfc.ac.uk">e-Science</a> department of the STFC; any mistakes or oversights are entirely mine.</p>
]]></content:encoded>
			<wfw:commentRss>http://austgate.co.uk/2010/03/creating-the-text-ontology/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Mining data driving the web?</title>
		<link>http://austgate.co.uk/2010/03/mining-data-driving-the-web/</link>
		<comments>http://austgate.co.uk/2010/03/mining-data-driving-the-web/#comments</comments>
		<pubDate>Wed, 17 Mar 2010 19:54:30 +0000</pubDate>
		<dc:creator>iain_emsley</dc:creator>
				<category><![CDATA[Information Retrieval]]></category>
		<category><![CDATA[Text Mining]]></category>
		<category><![CDATA[data sets]]></category>

		<guid isPermaLink="false">http://austgate.co.uk/?p=128</guid>
		<description><![CDATA[Just seen an article on TechCrunch by Bradford Cross of Flightcaster regarding the growth of data on the Web. He appears to argue that data and its uses will drive the Web soon, writing: the data age is less about the raw size of your data, and more about the cool stuff you can do [...]]]></description>
			<content:encoded><![CDATA[<p>Just seen an article on TechCrunch by Bradford Cross of Flightcaster regarding the <a title="Bradford cross on data" href="http://techcrunch.com/2010/03/16/big-data-freedom/" target="_blank">growth of data</a> on the Web. He appears to argue that data and its uses will drive the Web soon, writing:</p>
<blockquote><p>the data age is less about the raw size of your data, and more about the  cool stuff you can do with it. Now that there is so much data, it is  time to unlock its value.</p></blockquote>
<p>It seems fairly straightforward given the lower barriers to growth and the tools available to create and access data.</p>
<p>There are issues with this, such as learning how best to leverage these for the user and to gain the most benefit. It&#8217;ll certainly be an interesting time, and Cross identifies a few technologies and ideas which may or may not gain currency but will spark debate nonetheless.</p>
]]></content:encoded>
			<wfw:commentRss>http://austgate.co.uk/2010/03/mining-data-driving-the-web/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
