<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>The Aust Gate &#187; weeknotes</title>
	<atom:link href="http://austgate.co.uk/category/weeknotes/feed/" rel="self" type="application/rss+xml" />
	<link>http://austgate.co.uk</link>
	<description>Open Knowledge and Literature</description>
	<lastBuildDate>Tue, 08 May 2012 20:33:34 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
		<item>
		<title>Weeknotes: Testing and documentation</title>
		<link>http://austgate.co.uk/2012/03/weeknotes-testing-and-documentation/</link>
		<comments>http://austgate.co.uk/2012/03/weeknotes-testing-and-documentation/#comments</comments>
		<pubDate>Sun, 04 Mar 2012 14:50:17 +0000</pubDate>
		<dc:creator>iain_emsley</dc:creator>
				<category><![CDATA[weeknotes]]></category>
		<category><![CDATA[documentation]]></category>
		<category><![CDATA[geolocation]]></category>
		<category><![CDATA[testing]]></category>

		<guid isPermaLink="false">http://austgate.co.uk/?p=485</guid>
		<description><![CDATA[It has been a slightly quiet week but one in which I have been working on the quieter parts of development &#8211; testing and documentation. I was helping out some colleagues by testing some existing code this week, not because I do not have anything to do but because they are working on something [...]]]></description>
			<content:encoded><![CDATA[<p>It has been a slightly quiet week but one in which I have been working on the quieter parts of development &#8211; testing and documentation.</p>
<p>I was helping out some colleagues by testing some existing code this week, not because I do not have anything to do but because they are working on something large which required some help with testing and bug hunting. I worked on the test plan with my manager and extended it when I was looking at other parts of the site. This was the moment when I lost my testing &#8220;cherry&#8221; as it were and fired up the <a title="SeleniumHQ IDE" href="http://seleniumhq.org/projects/ide/" target="_blank">Selenium IDE</a> in Firefox.</p>
<p>Using its simple recording and playback facilities, I was able to reconstruct our testing steps and even add to them as I saw that assumptions had been made in the original document.</p>
<p>The simplest way of creating tests is to fire up Selenium and to go to the beginning of the tests that you want to run. Then click the record button and it will watch you click around the site and record a series of steps. When you click record again to stop it running, you can get it to run through all the steps or just one step using the playback buttons.</p>
<p>If you need to add a step, you can use the simple interface to create or delete steps and the actions they must take before re-running the tests. It does take some time to set up the tests but that must be weighed against the cost of running the tests manually and perhaps missing steps.</p>
<p>Once the tests are complete and running, you can save them using the file option and then create a folder of them to run as required. Alternatively they can be exported into various formats including JUnit and PHPUnit.</p>
<p>The next step will be to add unit tests to code and try to backport some tests into existing code. The set-up time is outweighed by the time saved later. I suppose a future evening project will be to put this together with a Selenium server and pull code from a repository.</p>
<p>The other task has been writing documentation for recently completed projects and some ongoing and planned ones. Writing up some user stories for new ideas and going back over old code has made me re-evaluate some of the code and consider it as part of something organic. It is less of a chore now and more a help to myself and colleagues so that, when the paperwork and wikis are up to date, we could in theory have a break and, if something happened, one of us could dive in. Alternatively we can make our iterative process better and stronger since we are not trying to rediscover the process of something by following code until we stumble across the answer.</p>
<p>I did have an interesting time with HTML5 geolocation which I need to look into more deeply. Having got some code quickly working to show a position on a map, I found a dead area in Oxford whilst showing somebody else the app. I am assuming that the geolocation runs off masts and the GPS system but it does raise a caveat to the enterprise of using pure HTML5. I need to put the code onto a page and wander around the town with my phone one Saturday.</p>
]]></content:encoded>
			<wfw:commentRss>http://austgate.co.uk/2012/03/weeknotes-testing-and-documentation/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Weeknotes: Open Correspondence and TextCamp</title>
		<link>http://austgate.co.uk/2012/02/weeknotes-open-correspondence-and-textcamp/</link>
		<comments>http://austgate.co.uk/2012/02/weeknotes-open-correspondence-and-textcamp/#comments</comments>
		<pubDate>Sun, 19 Feb 2012 14:20:02 +0000</pubDate>
		<dc:creator>iain_emsley</dc:creator>
				<category><![CDATA[projects]]></category>
		<category><![CDATA[weeknotes]]></category>
		<category><![CDATA[open_correspondence]]></category>
		<category><![CDATA[textcamp]]></category>

		<guid isPermaLink="false">http://austgate.co.uk/?p=476</guid>
		<description><![CDATA[It has been a while since I&#8217;ve written a weeknote. Must get back into the habit. Open Correspondence Development on the Open Correspondence project has been slow to stalled for a while. I have been doing bits and pieces but sitting down with Mark McGillivray of Cottage Labs and the Open Knowledge Foundation, brought some [...]]]></description>
			<content:encoded><![CDATA[<p>It has been a while since I&#8217;ve written a weeknote. Must get back into the habit.</p>
<p><span style="text-decoration: underline;">Open Correspondence</span></p>
<p>Development on the <a title="Open Correspondence site" href="http://www.opencorrespondence.org/" target="_blank">Open Correspondence</a> project has been slow to stalled for a while. I have been doing bits and pieces but sitting down with Mark McGillivray of Cottage Labs and the Open Knowledge Foundation brought some clarity. Recently the <a title="Textus project" href="http://wiki.okfn.org/Projects/Textus" target="_blank">Textus project</a> has been announced and I have been talking with the developers about putting the data onto that platform. It seems to me that it is better to pool resources and to contribute where I can. There are parts of the existing project that I like and others that need more work to make me happy, and it seems right to move now onto the developing platform.</p>
<p><span style="text-decoration: underline;">Textcamp</span></p>
<p>At Textcamp last September, one of the sessions covered DIY Bookscanners (<a title="Textcamp post" href="http://austgate.co.uk/2011/08/thinking-about-texts-and-communities-at-textcamp/" target="_blank">Austgate post on Textcamp</a>). One of the actions on the Textus wiki was OCRing text. I have posted previously about <a title="Austgate and Tesseract" href="http://austgate.co.uk/2011/11/using-tesseract-with-python-for-ocr/" target="_blank">playing with Tesseract</a> and, seeing this, I emailed the humanities-dev list to explore the possibilities. To this end, I have volunteered to work on the area and will write a blog post about it. There is already a large amount of work that exists, so I am perhaps not developing anything new. However it would, I think, be interesting to develop a stand-alone system that is flexible and downloadable. Like other OKF projects, it will be a Python project but also a hardware project to try and extend some of the existing projects.</p>
<p><span style="text-decoration: underline;">Other Bits</span></p>
<p>I&#8217;ve been working on an indexing project which appears to be coming together quite nicely. Hopefully I&#8217;ll be able to say some more shortly but it depends on a conversation that has yet to be had.</p>
<p>Next week, after a break, is a return to work and to data. The Dev8d conference provided me with some ideas and clarity on one or two things, so time to put them into practice.</p>
]]></content:encoded>
			<wfw:commentRss>http://austgate.co.uk/2012/02/weeknotes-open-correspondence-and-textcamp/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Weeknotes: Documents and data</title>
		<link>http://austgate.co.uk/2011/07/weeknotes-documents-and-data/</link>
		<comments>http://austgate.co.uk/2011/07/weeknotes-documents-and-data/#comments</comments>
		<pubDate>Sun, 03 Jul 2011 14:39:22 +0000</pubDate>
		<dc:creator>iain_emsley</dc:creator>
				<category><![CDATA[Information Retrieval]]></category>
		<category><![CDATA[weeknotes]]></category>
		<category><![CDATA[documents]]></category>
		<category><![CDATA[drupal]]></category>
		<category><![CDATA[linked_data]]></category>

		<guid isPermaLink="false">http://austgate.co.uk/?p=364</guid>
		<description><![CDATA[The main project this week (apart from the ongoing one of moving and virtualising servers) is to begin work on our technical documents. I&#8217;m trying to move them onto the web and make them useful, not only in terms of reading about them but also to make them linkable. I&#8217;m trying to get them out [...]]]></description>
			<content:encoded><![CDATA[<p>The main project this week (apart from the ongoing one of moving and virtualising servers) is to begin work on our technical documents.</p>
<p>I&#8217;m trying to move them onto the web and make them useful, not only in terms of reading about them but also to make them linkable. I&#8217;m trying to get them out of being placed on a web site as Word or PDF downloads and move them into being web pages with comments. Drupal 7&#8217;s inbuilt book module is probably the way to go and is producing some really nice results in the hacking I managed on Friday. There is a certain pleasure now in that I began the hack at 8:30 and within an hour, I had a working document (albeit I wanted to mess around with the URLs to make them nicer and far more meaningful). It had comments and was generally felt to be good.</p>
<p>The next task was to work on a way of doing Frequently Asked Questions (FAQs). Having begun some of the work using the <a title="Frequently Asked Questions Drupal module" href="http://drupal.org/project/faq" target="_blank">Frequently Asked Questions module</a>, I decided it had too many issues for us (including not being able to control where the page was, and it did not appear to play nicely with the <a title="Pathauto Drupal module" href="http://drupal.org/project/pathauto" target="_blank">Pathauto rewriting module</a>), so I wrote my own content type which we can manipulate via the Views module to create sets of FAQs. When I&#8217;ve got more time, I may come back to the module and try to help fix some bugs.</p>
<p>Whilst neither of these is a finished item, it was a pleasant day hacking and creating, getting prototypes ready in a day. I&#8217;m taking this as a sign of increasing familiarity with Drupal. I do, however, need to find a morning to finish the Sugar SOAP integration module and tidy that up. Ideally I&#8217;d try to find a way of integrating it with the current module to offer swappable backends.</p>
<p>I&#8217;ve also started looking at using <a title="Redis website" href="http://redis.io" target="_blank">Redis</a> for caching again in a major way to ensure that various static fields of data, such as UK counties, can have a common reference to reduce data cleaning issues such as county being written as co., co and county. I&#8217;m also looking at the issue of Linked Data and how to integrate the ideas into our current projects. For now I&#8217;m rereading <a title="Tim Berners-Lee on Linked Data" href="http://www.w3.org/DesignIssues/LinkedData.html" target="_blank">Tim Berners-Lee&#8217;s guide</a>, linked from the <a title="Linked Data website" href="http://linkeddata.org/" target="_blank">linkeddata.org</a> website, and formulating ideas and refining the ones I currently have.</p>
<p>Ambition might get the better of me but at least I feel like I want to take all of this on and to try to improve skills and learn more. In the meanwhile, I have some serious hills to climb.</p>
<p>Update: This post has got me rethinking the Open Correspondence RDF and Linked Data. The more I delve, the greater my sense of needing to rethink that part of the project and to complete the correspondence links. Most of them are there but need complete linking. I also need to look at Python&#8217;s <a title="Python's RDFLib code" href="http://www.rdflib.net/" target="_blank">RDFLib</a> and perhaps make better use of the SPARQL queries and stores. I sense an evening or several of experimentation before a hacking weekend to resolve these issues.</p>
]]></content:encoded>
			<wfw:commentRss>http://austgate.co.uk/2011/07/weeknotes-documents-and-data/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Weeknotes: Drupal, NoSQL and data</title>
		<link>http://austgate.co.uk/2011/06/weeknotes-drupal-nosql-and-data/</link>
		<comments>http://austgate.co.uk/2011/06/weeknotes-drupal-nosql-and-data/#comments</comments>
		<pubDate>Sun, 26 Jun 2011 11:02:47 +0000</pubDate>
		<dc:creator>iain_emsley</dc:creator>
				<category><![CDATA[weeknotes]]></category>

		<guid isPermaLink="false">http://austgate.co.uk/?p=362</guid>
		<description><![CDATA[It has been an interesting week which I would rather forget. However, I will not, and it has made me rethink quite a few assumptions. On the plus side, I&#8217;ve managed to write some of the documentation for the portal and map the processes which need to be coded next week. The major thing that I [...]]]></description>
			<content:encoded><![CDATA[<p>It has been an interesting week which I would rather forget. However, I will not, and it has made me rethink quite a few assumptions. On the plus side, I&#8217;ve managed to write some of the documentation for the portal and map the processes which need to be coded next week.</p>
<p>The major thing that I have completed is the basic integration of Drupal 7 with SugarCRM Community edition. At the moment it definitely works with version 6.12 as this is what I have at the moment but I&#8217;m going to upgrade to 6.2. I do not see any issues with this apart from having to remap one or two fields. I&#8217;m hoping, next week now, to split off some of the changes and to offer them to the original project as patches so that the main <a title="Drupal Webform2Sugar project" href="http://drupal.org/project/webform2sugar" target="_blank">webform2sugar</a> project can bring them on board or not as they will.</p>
<p>In tandem with the data cleaning project mentioned last week, I am looking at caching data using <a title="Redis website" href="http://redis.io" target="_blank">Redis</a> behind forms to offer fixed lists of data. Although we commonly use PHP, I am strongly thinking of writing the readers and document parsers in either Perl or Python. What I might do is write some test scripts in both and benchmark them, but I also have to weigh up their handling of MS Word (probably largely the 2003 version rather than the 2007 one) and PDF documents, as I will be moving them across platforms.</p>
<p>I&#8217;ve also been thinking about NoSQL stores for other pieces of data and projects which are being worked on. The <a title="HighScalability on NoSQL use cases" href="http://highscalability.com/blog/2011/6/20/35-use-cases-for-choosing-your-next-nosql-database.html" target="_blank">HighScalability blog</a> has a great piece on what to look for, and under which circumstances, when choosing between SQL and NoSQL databases.</p>
]]></content:encoded>
			<wfw:commentRss>http://austgate.co.uk/2011/06/weeknotes-drupal-nosql-and-data/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>JISCMail to migrate to new platform</title>
		<link>http://austgate.co.uk/2011/06/jiscmail-to-migrate-to-new-platform/</link>
		<comments>http://austgate.co.uk/2011/06/jiscmail-to-migrate-to-new-platform/#comments</comments>
		<pubDate>Sun, 19 Jun 2011 13:25:52 +0000</pubDate>
		<dc:creator>iain_emsley</dc:creator>
				<category><![CDATA[weeknotes]]></category>
		<category><![CDATA[jiscmail]]></category>

		<guid isPermaLink="false">http://austgate.co.uk/?p=352</guid>
		<description><![CDATA[I see from Twitter that JISCMail announced that they have funding for another year which is a good thing (in the Sellar and Yeatman sense). It does mention that they are migrating onto a new platform (though, it is implied, keeping the current mail system) but does not mention what this might be. The statement [...]]]></description>
			<content:encoded><![CDATA[<p>I see from Twitter that <a title="JISCMail funding news" href="http://www.jiscmail.ac.uk/news/2011/june2011.html" target="_blank">JISCMail announced that they have funding</a> for another year which is a good thing (in the Sellar and Yeatman sense). It does mention that they are migrating onto a new platform (though, it is implied, keeping the current mail system) but does not mention what this might be.</p>
<p>The statement linked to is more than terse but I wait with interest to see what this new platform is and what it will do to &#8220;provide new benefits beyond JISCMail’s traditional boundaries&#8221;. I did hear mention of social networking in passing and integration with social networks was being discussed whilst I was there.</p>
<p>Further announcements to come&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://austgate.co.uk/2011/06/jiscmail-to-migrate-to-new-platform/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Weeknotes: Storing and cleaning data</title>
		<link>http://austgate.co.uk/2011/06/weeknotes-storing-and-cleaning-data/</link>
		<comments>http://austgate.co.uk/2011/06/weeknotes-storing-and-cleaning-data/#comments</comments>
		<pubDate>Sun, 19 Jun 2011 13:15:13 +0000</pubDate>
		<dc:creator>iain_emsley</dc:creator>
				<category><![CDATA[weeknotes]]></category>
		<category><![CDATA[data sets]]></category>
		<category><![CDATA[node]]></category>
		<category><![CDATA[redis]]></category>

		<guid isPermaLink="false">http://austgate.co.uk/?p=350</guid>
		<description><![CDATA[This week has been soft launching a CRM system for the Janet project. Hopefully these would be just user bugs but it has highlighted some interesting data cleaning issues. These are going to be inherent in the exchange of data between two or more systems, especially when one is a long-term pre-existing one. This has [...]]]></description>
			<content:encoded><![CDATA[<p>This week has been spent soft launching a CRM system for the Janet project. Hopefully the problems reported will turn out to be just user bugs, but the launch has highlighted some interesting data cleaning issues. These are going to be inherent in the exchange of data between two or more systems, especially when one is a long-term pre-existing one.</p>
<p>This has long-term implications in terms of continuing to ensure that the data is clean and standardised. This matters because one of the forthcoming projects is based on our technical documents and converting them from existing formats (when these are fully confirmed) into the as yet undesigned and unbuilt system. As part of this I&#8217;ve been looking at Chris Gutteridge&#8217;s <a title="Chris Gutteridge's Grinder" href="https://github.com/cgutteridge/Grinder" target="_blank">Grinder</a>, a parser for getting RDF data out of Excel and CSV files. I was reminded of Grinder whilst reading his article about Linked Data at the University of Southampton in the final ever Nodalities. Whilst Grinder itself may not be of initial use, it does give me some clues about the possibilities of transforming the data.</p>
<p>The project also forces me to think about how the programme would run, and I suspect it will be from the command line. If this is a safe assumption, then it means that I need to get back to Perl or use Python. Much as I like PHP, I&#8217;m not sure it is a command line language. I know it can be run as one but it always makes me nervous as I don&#8217;t really consider it a system administration or data munging language. In either case, Perl and Python mean another re-learning curve, especially Perl which I last used at JISCMail a couple of years ago.</p>
<p>A side project that I&#8217;ve been looking at is the real-time data storage of feeds for later mining and use. I&#8217;ve been thinking of using Node.js (and actually starting something!) and Redis to run in the background. A little side something, methinks. It does mean me learning more about Node though and gives me something tangible to build. I&#8217;ve been having a little search around the Net and came across an older post by <a title="Marshall Kirkpatrick on Realtime web" href="http://www.nten.org/blog/2009/10/28/ten-useful-examples-realtime-web-action" target="_blank">Marshall Kirkpatrick on the NTEN blog about realtime data</a> whilst reading about <a title="Elegant Code blog on Node event loops" href="http://elegantcode.com/2010/11/19/taking-baby-steps-with-node-js-threads-vs-events/" target="_blank">event loops in Node on the Elegant Code</a> blog. Of course, once it is stored, it must be processed to be useful but that is the next step.</p>
]]></content:encoded>
			<wfw:commentRss>http://austgate.co.uk/2011/06/weeknotes-storing-and-cleaning-data/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Weeknotes: Open Correspondence toolkit and converting XML into JSON</title>
		<link>http://austgate.co.uk/2011/05/weeknotes-open-correspondence-toolkit-and-converting-xml-into-json/</link>
		<comments>http://austgate.co.uk/2011/05/weeknotes-open-correspondence-toolkit-and-converting-xml-into-json/#comments</comments>
		<pubDate>Thu, 26 May 2011 19:25:47 +0000</pubDate>
		<dc:creator>iain_emsley</dc:creator>
				<category><![CDATA[Open Knowledge]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[projects]]></category>
		<category><![CDATA[Text Mining]]></category>
		<category><![CDATA[weeknotes]]></category>

		<guid isPermaLink="false">http://austgate.co.uk/?p=342</guid>
		<description><![CDATA[I&#8217;ve been quiet for a bit though generally because I&#8217;ve been quite busy on projects and exploring ideas. After Book Hackday, I&#8217;ve written a post about beginning to develop the Open Correspondence toolkit for the Open Knowledge Foundation&#8217;s Notebook blog. I was also contacted regarding converting the TEI XML pages into JSON, which I am [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been quiet for a bit though generally because I&#8217;ve been quite busy on projects and exploring ideas.</p>
<p>After Book Hackday, I&#8217;ve written a post about <a title="Open Correspondence toolkit" href="http://notebook.okfn.org/2011/05/25/mining-the-personal-using-open-correspondence-to-explore-correspondents/" target="_blank">beginning to develop the Open Correspondence toolkit</a> for the Open Knowledge Foundation&#8217;s Notebook blog. I was also contacted regarding converting the TEI XML pages into JSON, which I am currently working on.</p>
<p>Once I&#8217;ve done some more work on it, I&#8217;ll post the code and more about it.</p>
<p>I&#8217;ve been working on another project which may or may not be open. It is certainly interesting but I am not sure I can say much more than that. I hope to have a blog post up soon about it but I am rather excited by it and its possibilities.</p>
<p>Meanwhile, the work project continues apace with some surprising outcomes for me. After watching a video on Facebook&#8217;s architecture, I&#8217;m beginning to see certain parts very differently. I really do hope to write more on this but I&#8217;ve got some building to do and a bit more delving and reading that needs completion.</p>
]]></content:encoded>
			<wfw:commentRss>http://austgate.co.uk/2011/05/weeknotes-open-correspondence-toolkit-and-converting-xml-into-json/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Weeknotes: Open Correspondence updates</title>
		<link>http://austgate.co.uk/2011/03/weeknotes-open-correspondence-updates/</link>
		<comments>http://austgate.co.uk/2011/03/weeknotes-open-correspondence-updates/#comments</comments>
		<pubDate>Tue, 08 Mar 2011 10:01:37 +0000</pubDate>
		<dc:creator>iain_emsley</dc:creator>
				<category><![CDATA[projects]]></category>
		<category><![CDATA[Text Mining]]></category>
		<category><![CDATA[weeknotes]]></category>
		<category><![CDATA[mapping]]></category>
		<category><![CDATA[open_correspondence]]></category>
		<category><![CDATA[timelines]]></category>

		<guid isPermaLink="false">http://austgate.co.uk/?p=298</guid>
		<description><![CDATA[I&#8217;ve bitten the bullet and done it. I&#8217;ve uploaded the current changes to the Open Correspondence site. The current changes are: additional fields in the RDF endpoint.  I still need to do some major work to JSON and XML which I hope to do for the next update. a basic text search a basic set [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve bitten the bullet and done it. I&#8217;ve uploaded the current changes to the Open Correspondence site.</p>
<p>The current changes are:</p>
<ul>
<li>additional fields in the RDF endpoint. I still need to do some major work to JSON and XML which I hope to do for the next update.</li>
<li>a basic text search</li>
<li>a basic set of geographic data in the collection</li>
<li>better linking from the letters to the correspondent and geographical data (NB it is still incomplete)</li>
<li>some mapping with <a title="Open Layers Javascript mapping website" href="http://openlayers.org/" target="_blank">Open Layers</a> javascript.</li>
<li>a <a title="Simile timeline " href="http://www.simile-widgets.org/timeline/" target="_blank">Simile</a> timeline (which is a bit slow at the moment).</li>
</ul>
<p>Admittedly some of this is exposing work already there but hidden. However I&#8217;ve also been working on some Unicode fixes to the underlying XML used by the project, which has meant rebuilding the tables and the Xapian indexes.</p>
<p>Following a request on the Open Literature mailing list, I&#8217;m looking at the idea of using Python&#8217;s <a title="Python Natural Language Toolkit" href="http://www.nltk.org/" target="_blank">NLTK</a> to create some linguistic API wrappers around the Xapian search. It strikes me that these letters can be used to create a corpus of Dickens&#8217;s language where you can explore the language used in family correspondence (to his daughters and wife), to other authors (Wilkie Collins) and to readers. That is a longer project though in terms of building the relevant indexes.</p>
<p>I&#8217;m also looking at the idea of clustering a collection of letters to a correspondent and seeing what happens (for some reason, the current script is looking at Wilkie Collins). There is also a set of queries that one might run against letters discussing books and the publication dates to view the distribution. I&#8217;m working on these latter questions at the moment for intended release later this week but I do foresee it being delayed a while.</p>
]]></content:encoded>
			<wfw:commentRss>http://austgate.co.uk/2011/03/weeknotes-open-correspondence-updates/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Weeknotes: Conferences and Open Correspondence</title>
		<link>http://austgate.co.uk/2011/02/weeknotes-conferences-and-open-correspondence/</link>
		<comments>http://austgate.co.uk/2011/02/weeknotes-conferences-and-open-correspondence/#comments</comments>
		<pubDate>Sun, 20 Feb 2011 15:55:37 +0000</pubDate>
		<dc:creator>iain_emsley</dc:creator>
				<category><![CDATA[weeknotes]]></category>
		<category><![CDATA[conferences]]></category>
		<category><![CDATA[dev8d]]></category>
		<category><![CDATA[linked_data]]></category>
		<category><![CDATA[mobile_web]]></category>

		<guid isPermaLink="false">http://austgate.co.uk/?p=291</guid>
		<description><![CDATA[On Wednesday I went to the JISC dev8d conference. I wish I could have gone for both days but time doesn&#8217;t permit at the moment. In all, I had a thought-provoking day and managed to catch talks on the Mobile Web (which I wasn&#8217;t expecting) and Linked Data. Whilst I didn&#8217;t attend the programming [...]]]></description>
			<content:encoded><![CDATA[<p>On Wednesday I went to the JISC <a title="JISC funded dev8d conference" href="http://www.dev8d.org/" target="_blank">dev8d</a> conference. I wish I could have gone for both days but time doesn&#8217;t permit at the moment. In all, I had a thought-provoking day and managed to catch talks on the Mobile Web (which I wasn&#8217;t expecting) and Linked Data. Whilst I didn&#8217;t attend the programming workshops on languages such as <a title="Clojure website" href="http://clojure.org/" target="_blank">Clojure</a> or <a title="Erlang site" href="http://www.erlang.org/" target="_blank">Erlang</a> (at the moment I don&#8217;t have a need to use either), I was looking for matters that might be useful for my impending move to <a title="Janet website" href="http://ja.net" target="_blank">Janet</a>. (This is one of the reasons why I haven&#8217;t posted recently &#8211; I was either preparing or convinced I hadn&#8217;t got the job.)</p>
<p>I bumped into <a title="Eamonn Neylon on Twitter" href="http://twitter.com/eneylon" target="_blank">Eamonn Neylon</a> and we went along to the Mobile Web session with Mike Jones from Bristol and the <a title="Molly project: open source mobile portal" href="http://mollyproject.org/" target="_blank">Molly project</a>. They outlined the two main approaches (either via the various app markets or having a front end which caters for the different phones) and issues such as being sandboxed from the hardware layer at the moment. It would seem from them that you need to do both ideally, though development time doesn&#8217;t always allow. The session was slightly hijacked by the Python 2 versus 3 question and whether Molly would ever use Python 3 but we gradually got back on track. The main barrier to entry would be the lack of standardisation so you need to target the platforms as well as the hardware issues.</p>
<p>I stayed for the Linked Data session, in which <a title="Chris Gutteridge's page at ECS" href="http://www.ecs.soton.ac.uk/people/cjg" target="_blank">Chris Gutteridge</a> sort of took control from the array of speakers. The main focus became notions of openness (as defined in the Open Knowledge Definition) and how it is perceived by the academic community, bringing back up the ideas of attribution on the web. (An issue which is partly cultural.) The issue of clear licensing came up again as well but there does seem to be some clarification needed on the different models.</p>
<p>I did go to some of the lightning talks before lunch but they didn&#8217;t leave much of an impression this time (though Chris Gutteridge did plug his <a title="Q&amp;D RDF browser" href="http://graphite.ecs.soton.ac.uk/browser/" target="_blank">Q&amp;D RDF Browser</a> which I&#8217;m thinking of using). After an excellent lunch, I wandered into basecamp where I plugged in my laptop and worked on <a title="Open Correspondence" href="http://www.opencorrespondence.org" target="_blank">Open Correspondence</a> whilst waiting for the session on the Linked Data API. I did spend a couple of hours&#8217; work on it to fix some bugs and little things for the next version to go live (though I discovered another one in places with some missing but it is not huge, just needs a couple of hours). <a title="Rufus Pollock's site" href="http://rufuspollock.org" target="_blank">Rufus Pollock</a> and <a title="Jo Walsh's site" href="http://frot.org/" target="_blank">Jo Walsh</a> popped by so we managed to catch up and do some hacking. Rufus suggested using <a title="Python's flask " href="http://flask.pocoo.org/" target="_blank">Flask</a> which I think I&#8217;ll use for some smaller projects in the future (and for some reason <a title="Backbone.js Github" href="http://documentcloud.github.com/backbone/" target="_blank">Backbone</a> was mentioned but I am not sure how).</p>
<p>I went along to <a title="epimorphics site" href="http://www.epimorphics.com/web/" target="_blank">Chris Dollin</a>&#8217;s talk on his <a title="ELDA implementation of Linked Data" href="http://elda.googlecode.com/hg/deliver-elda/src/main/docs/index.html" target="_blank">eLDA</a> library and how the Linked Data API works. It seems like an eminently sensible way of hiding the complexity of semantic technologies from the user and making them easier to use. It is certainly something which will be useful to get my head around completely.</p>
<p>The day, as suggested by <a title="Devcsi page" href="http://devcsi.ukoln.ac.uk/" target="_blank">Mahendra Mahey</a>, was definitely more useful when doing something and just cracking on with it. We need more days like this as they provide a collegiate atmosphere to try new things and take a look at different technologies which might not appear on the normal radar. The friendly atmosphere was great as well. I&#8217;ll book both days off if it comes around next year.</p>
]]></content:encoded>
			<wfw:commentRss>http://austgate.co.uk/2011/02/weeknotes-conferences-and-open-correspondence/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Weeknotes: Places in Open Correspondence</title>
		<link>http://austgate.co.uk/2011/02/weeknotes-places-in-open-correspondence/</link>
		<comments>http://austgate.co.uk/2011/02/weeknotes-places-in-open-correspondence/#comments</comments>
		<pubDate>Sun, 06 Feb 2011 13:25:55 +0000</pubDate>
		<dc:creator>iain_emsley</dc:creator>
				<category><![CDATA[projects]]></category>
		<category><![CDATA[weeknotes]]></category>
		<category><![CDATA[open_correspondence]]></category>
		<category><![CDATA[place_names]]></category>

		<guid isPermaLink="false">http://austgate.co.uk/?p=288</guid>
		<description><![CDATA[I&#8217;ve been doing some work on Open Correspondence over the last couple of weeks. I started re-parsing the letters to expose some more metadata, mainly placenames, and to normalise them. I&#8217;ve finally done the first pass of this update which I&#8217;m hoping to make live soon once I&#8217;ve updated the controllers and re-checked the other [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been doing some work on Open Correspondence over the last couple of weeks. I started re-parsing the letters to expose some more metadata, mainly placenames, and to normalise them.</p>
<p>I&#8217;ve finally done the first pass of this update which I&#8217;m hoping to make live soon once I&#8217;ve updated the controllers and re-checked the other data improvements. Whilst it is not perfect, it is a lot better than it was. I think that the next week will be spent going over the endpoints and the Pylons controllers so that the data is cleaner than at present and correctly linked.</p>
<p>It has been a useful exercise in that I&#8217;ve started rewriting the parser for the letters (an ongoing large job I was thinking of doing when I come to the next set of letters) and putting some of the earlier thoughts into place.</p>
<p>Once I&#8217;m happy with these updates, I&#8217;ll update the site, which does mean rebuilding the databases and endpoints. However, once it is done, it should be a lot cleaner and I can then start looking at the correspondents and linking into other data sources like dbpedia.org. I think that the first task, though, might be to restart work on the clients that I had been putting together as a basic development kit.</p>
]]></content:encoded>
			<wfw:commentRss>http://austgate.co.uk/2011/02/weeknotes-places-in-open-correspondence/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

