<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>The Aust Gate &#187; charles dickens</title>
	<atom:link href="http://austgate.co.uk/tags/charles-dickens/feed/" rel="self" type="application/rss+xml" />
	<link>http://austgate.co.uk</link>
	<description>Open Knowledge and Literature</description>
	<lastBuildDate>Tue, 08 May 2012 20:33:34 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
		<item>
		<title>Trying to geolocate a part of Charles Dickens</title>
		<link>http://austgate.co.uk/2012/02/trying-to-geolocate-a-part-of-charles-dickens/</link>
		<comments>http://austgate.co.uk/2012/02/trying-to-geolocate-a-part-of-charles-dickens/#comments</comments>
		<pubDate>Mon, 20 Feb 2012 19:45:32 +0000</pubDate>
		<dc:creator>iain_emsley</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[projects]]></category>
		<category><![CDATA[charles dickens]]></category>
		<category><![CDATA[geolocation]]></category>

		<guid isPermaLink="false">http://austgate.co.uk/?p=482</guid>
		<description><![CDATA[I have been working at a snail&#8217;s pace on some geolocation queries on some locations which are associated with Charles Dickens. Using Python&#8217;s NLTK library, I managed to extract around 60 distinct locations from his novel, Bleak House. A bit of human editing has tidied this up for me but it looks useful. Having popped [...]]]></description>
			<content:encoded><![CDATA[<p>I have been working at a snail&#8217;s pace on some geolocation queries on some locations which are associated with Charles Dickens.</p>
<p>Using Python&#8217;s NLTK library, I managed to extract around 60 distinct locations from his novel, Bleak House. A bit of human editing has tidied this up for me but it looks useful.</p>
<p>Having popped this into MySQL (as I am more familiar with it than PostgreSQL) and returned it using OpenLayers, I&#8217;ve been looking at ways of writing a query which takes the user&#8217;s location (once given permission) and then finds locations that are around a mile away, or within walking distance. A bit of searching around led me to this <a title="Stackoverflow, GPS and geolocation" href="http://stackoverflow.com/questions/3349808/php-mysql-get-locations-in-radius-users-location-from-gps" target="_blank">Stack Overflow question on finding locations within a certain radius</a> of a given location. The post is also helpful in pointing me towards some of the maths so that, when I have some time, I can try to understand the underlying query.</p>
<p>Anyhow, HTML5 to some degree has come to the rescue with its geolocation API. I know it is not a standard at the moment (the <a title="W3 geolocation draft" href="http://dev.w3.org/geo/api/spec-source.html" target="_blank">draft I saw was 28th June, 2011</a>), but I thought it would be fun to use it to start doing some mapping with the API, courtesy of some guidance from <a title="HTML5 Doctor on geolocation" href="http://html5doctor.com/finding-your-position-with-geolocation/" target="_blank">HTML5Doctor</a> and <a title="HTML5 demos on geolocation" href="http://html5demos.com/geo" target="_blank">HTML5demos</a> site. It also helps me to re-use some of the data collected for the <a title="Open Correspondence site" href="http://www.opencorrespondence.org/" target="_blank">Open Correspondence</a> project and to bring locations related to Dickens together and make them useful.</p>
<p>What would be interesting to see is the locations surrounding a given latitude and longitude within one mile so that the user could walk to them or find out about them and query them to see if they really want to go to that location.</p>
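<p>As a rough sketch of the radius idea, the same filter can be done in Python rather than in the MySQL query. It uses the haversine formula for great-circle distance. The function names, the sample rows and their approximate coordinates are illustrative assumptions, not the project data.</p>

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def locations_within(user_lat, user_lon, locations, radius_miles=1.0):
    """Return (name, distance) pairs inside the radius, nearest first."""
    nearby = []
    for name, lat, lon in locations:
        distance = haversine_miles(user_lat, user_lon, lat, lon)
        if distance <= radius_miles:
            nearby.append((name, distance))
    return sorted(nearby, key=lambda pair: pair[1])


# Hypothetical sample rows (name, latitude, longitude); the coordinates
# are approximate and for illustration only.
LOCATIONS = [
    ("Lincoln's Inn", 51.5165, -0.1141),
    ("Chancery Lane", 51.5154, -0.1117),
    ("St Albans", 51.7520, -0.3360),
]
```

<p>Calling locations_within(51.5155, -0.1120, LOCATIONS) with these sample rows returns Chancery Lane and then Lincoln&#8217;s Inn, and leaves out St Albans, which is well over a mile away.</p>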
]]></content:encoded>
			<wfw:commentRss>http://austgate.co.uk/2012/02/trying-to-geolocate-a-part-of-charles-dickens/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Weeknotes: Open Correspondence, Xapian and Linked Data</title>
		<link>http://austgate.co.uk/2010/11/weeknotes-open-correspondence-xapian-and-linked-data/</link>
		<comments>http://austgate.co.uk/2010/11/weeknotes-open-correspondence-xapian-and-linked-data/#comments</comments>
		<pubDate>Sun, 07 Nov 2010 10:58:20 +0000</pubDate>
		<dc:creator>iain_emsley</dc:creator>
				<category><![CDATA[Information Retrieval]]></category>
		<category><![CDATA[projects]]></category>
		<category><![CDATA[weeknotes]]></category>
		<category><![CDATA[charles dickens]]></category>
		<category><![CDATA[open_correspondence]]></category>
		<category><![CDATA[xapian]]></category>

		<guid isPermaLink="false">http://austgate.co.uk/?p=233</guid>
		<description><![CDATA[After last week&#8217;s server move, we discovered one or two things that needed to be changed before they could go live. The main thing was the Xapian search which I had been working on. The initial version kept the Xapian server on the local machine and used that to index and search the letters but [...]]]></description>
			<content:encoded><![CDATA[<p>After last week&#8217;s server move, we discovered one or two things that needed to be changed before they could go live. The main thing was the Xapian search which I had been working on. The initial version kept the Xapian server on the local machine and used that to index and search the letters, but the new version is distributed across machines so it required a brief change.</p>
<p>Opening a &#8220;one box wonder&#8221; Xapian search in Python is done via:</p>
<blockquote><p>xapian.WritableDatabase(db_path, xapian.DB_CREATE_OR_OPEN)</p></blockquote>
<p>where db_path is the path that you want to give the index database. The index is then opened for searching using:</p>
<blockquote><p>xapian.Database(db_path)</p></blockquote>
<p>Since the project uses Pylons, the controller used a path out to the .ini file loaded at runtime to link to the correct database.</p>
<p>Using the documentation on the <a title="Xapian Documentation on remote backends" href="http://xapian.org/docs/remote.html" target="_blank">Xapian site for remote backends</a> and the <a title="Xapian Python bindings documentation" href="http://xapian.org/docs/bindings/python/" target="_blank">Python bindings</a>, I was able to quickly adjust the code so that xapian.WritableDatabase is replaced by:</p>
<blockquote><p>xapian.remote_open_writable(&#8220;&lt;host name&gt;&#8221;, &#8220;&lt;port number&gt;&#8221;)</p></blockquote>
<p>and is opened by:</p>
<blockquote><p>xapian.remote_open(&#8220;&lt;host name&gt;&#8221;, &#8220;&lt;port number&gt;&#8221;)</p></blockquote>
<p>Once that is set up, all you need to do is start the TCP server, which is what I&#8217;ve been looking at. I downloaded the tar.gz file of Xapian-core from the Xapian site, configured and built it on Ubuntu Lucid Lynx, and then ran xapian-tcpsrv --port &lt;port number&gt; &lt;database name&gt; in a new terminal window, which allowed me to test the connections and get them ready for going live.</p>
<p>Changes are afoot on the Open Correspondence site as well. As part of a conversation that involved Keith Alexander, of <a title="Talis Platform" href="http://www.talis.com/platform" target="_blank">Talis</a>, the project is going to evolve in a slightly more Linked Data direction, with references to the books, magazines, correspondents and so on. I&#8217;d already started going in this direction with the correspondent links (such as <a title="Georgina Hogarth correspondent link on Open Correspondence" href="http://www.opencorrespondence.org/letters/correspondent/Miss%20Hogarth" target="_blank">http://www.opencorrespondence.org/letters/correspondent/Miss%20Hogarth</a>), so this is really an extension of where we need to go to connect to other resources such as DBpedia, Wikipedia and so on. The fact that it is <a title="Dickens 2012 website" href="http://www.dickens2012.org" target="_blank">Dickens&#8217;s bi-centenary in 2012</a> gives an added boost to the project. The Linked Data approach gives us the chance of creating some sort of framework for future expansion and linking together of data sources, not only at a literary level but also socially. It also encourages me to sort out the content negotiation work that was started, to try to follow the FAQs that the <a title="Pedantic Web group site" href="http://pedantic-web.org/" target="_blank">Pedantic Web</a> group have posted to make sure that the site follows the best standards that it can, and to build them into future developments and directions.</p>
]]></content:encoded>
			<wfw:commentRss>http://austgate.co.uk/2010/11/weeknotes-open-correspondence-xapian-and-linked-data/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Exporting and querying Dickens data</title>
		<link>http://austgate.co.uk/2010/03/exporting-data/</link>
		<comments>http://austgate.co.uk/2010/03/exporting-data/#comments</comments>
		<pubDate>Sun, 21 Mar 2010 12:15:35 +0000</pubDate>
		<dc:creator>iain_emsley</dc:creator>
				<category><![CDATA[Information Retrieval]]></category>
		<category><![CDATA[projects]]></category>
		<category><![CDATA[charles dickens]]></category>
		<category><![CDATA[rdf]]></category>

		<guid isPermaLink="false">http://austgate.co.uk/?p=137</guid>
		<description><![CDATA[As a follow-up to the posting regarding the proposed ontology, I&#8217;ve started to try and create a SPARQL endpoint. At some point soon, I want to use the new version of ARC as the version I&#8217;ve got here is a little out of date. After that the next thing should be to allow the [...]]]></description>
			<content:encoded><![CDATA[<p>As a follow-up to the posting regarding the proposed ontology, I&#8217;ve started to try and create a <a title="Dickens SPARQL endpoint" href="http://austgate.co.uk/dickens/export.php?type=rdf&amp;author=Dickens" target="_blank">SPARQL endpoint</a>. At some point soon, I want to use the new version of <a title="ARC website" href="http://arc.semsol.org/" target="_blank">ARC</a> as the version I&#8217;ve got here is a little out of date. After that the next thing should be to allow the endpoint to be converted into other forms like JSON.</p>
<p>UPDATE: I&#8217;ve created an endpoint using the default ARC settings here: <a title="RDF endpoint for Dickens project" href="http://austgate.co.uk/dickens/endpoint.php" target="_blank">http://austgate.co.uk/dickens/endpoint.php</a></p>
]]></content:encoded>
			<wfw:commentRss>http://austgate.co.uk/2010/03/exporting-data/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Letters of Charles Dickens website</title>
		<link>http://austgate.co.uk/2009/09/letters-of-charles-dickens-website/</link>
		<comments>http://austgate.co.uk/2009/09/letters-of-charles-dickens-website/#comments</comments>
		<pubDate>Fri, 18 Sep 2009 20:39:31 +0000</pubDate>
		<dc:creator>iain_emsley</dc:creator>
				<category><![CDATA[Information Retrieval]]></category>
		<category><![CDATA[Text Mining]]></category>
		<category><![CDATA[charles dickens]]></category>
		<category><![CDATA[letters]]></category>

		<guid isPermaLink="false">http://austgate.co.uk/?p=94</guid>
		<description><![CDATA[I&#8217;ve finally posted the first draft of the Dickens website here: http://austgate.co.uk/dickens/index.php?author=Dickens. The idea is that it will allow users to derive networks across a variety of Victorian authors as and when I can develop the datasets. I&#8217;ve also been developing a small text ontology to add to the Friend of a Friend (FOAF)  [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve finally posted the first draft of the Dickens website here: <a title="Letters of Dickens website" href="http://austgate.co.uk/dickens/index.php?author=Dickens" target="_blank">http://austgate.co.uk/dickens/index.php?author=Dickens</a>. The idea is that it will allow users to derive networks across a variety of Victorian authors as and when I can develop the datasets.</p>
<p>I&#8217;ve also been developing a small text ontology to add to the <a title="Friend of a Friend project" href="http://www.foaf-project.org/" target="_blank">Friend of a Friend</a> (FOAF)  and <a title="Dublin Core ontology" href="http://dublincore.org/" target="_blank">Dublin Core </a>(DC) ontologies. I&#8217;ll post details later. The database schema is still under development but I hope to get that change done soon so that I can get on with the XML changes.</p>
]]></content:encoded>
			<wfw:commentRss>http://austgate.co.uk/2009/09/letters-of-charles-dickens-website/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Mining the Letters of Charles Dickens</title>
		<link>http://austgate.co.uk/2009/07/mining-the-letters-of-charles-dickens/</link>
		<comments>http://austgate.co.uk/2009/07/mining-the-letters-of-charles-dickens/#comments</comments>
		<pubDate>Tue, 14 Jul 2009 07:41:13 +0000</pubDate>
		<dc:creator>iain_emsley</dc:creator>
				<category><![CDATA[Information Retrieval]]></category>
		<category><![CDATA[Text Mining]]></category>
		<category><![CDATA[charles dickens]]></category>
		<category><![CDATA[simile]]></category>

		<guid isPermaLink="false">http://austgate.co.uk/?p=81</guid>
		<description><![CDATA[As an aside I&#8217;ve started a small project to begin visualising ways of searching the letters of Charles Dickens and exploring the Simile library which MIT have produced. It&#8217;s originally an extension to the D-Space repository tool but Rufus Pollock used it in the Open Knowledge Foundation&#8217;s Weaving History project &#8211; to which I contributed the [...]]]></description>
			<content:encoded><![CDATA[<p>As an aside I&#8217;ve started a small project to begin visualising ways of searching the letters of Charles Dickens and exploring the <a title="Simile project page at MIT" href="http://simile.mit.edu/" target="_blank">Simile</a> library which MIT have produced.</p>
<p>It&#8217;s originally an extension to the D-Space repository tool but Rufus Pollock used it in the Open Knowledge Foundation&#8217;s <a title="Microfacts website" href="http://www.microfacts.org" target="_blank">Weaving History</a> project &#8211; to which I contributed the <a title="Milton threads on Microfacts" href="http://www.microfacts.org/thread/read/831cf372-1d28-4c98-ab55-c19899fa3840" target="_blank">Milton</a> JSON data file. Originally I&#8217;d used it just for biographical timelines but, thinking about it, I wondered how you could use it to mine datasets like the letters of Charles Dickens.</p>
<p>Dickens was a prolific letter writer (the Pilgrim edition extends to 12 thick volumes). I don&#8217;t have access to that data but I did download the first volume (of three) that his daughters edited.</p>
<p>Using Perl, I have extracted the date and recipient tags and converted the text file into JSON (as part of a larger process of converting the file into XML and using XSL to transform the data) and then created a table view of the data so that you can easily find the dates of the letters sent to certain people in <a title="Letters of Dickens project" href="/development/dickensletter.php" target="_blank">tabular form</a>.</p>
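<p>To give a flavour of the extraction step, here is a minimal sketch in Python rather than the Perl I actually used. The bracketed tag format, the function names and the placeholder values are all assumptions made up for illustration; the real text file is tagged differently.</p>

```python
import json
import re

# Hypothetical input format: each letter carries a recipient tag and a date
# tag on their own lines. The values below are placeholders, not real data.
SAMPLE_TEXT = """\
[recipient] Recipient A
[date] 1850-01-01
(letter text)

[recipient] Recipient B
[date] 1850-02-01
(letter text)
"""

TAG_PATTERN = re.compile(r"^\[(recipient|date)\]\s*(.+)$")


def letters_to_records(text):
    """Collect one recipient/date record per letter from the tagged lines."""
    records = []
    current = {}
    for line in text.splitlines():
        match = TAG_PATTERN.match(line.strip())
        if not match:
            continue  # body text is ignored in this sketch
        tag, value = match.groups()
        if tag == "recipient" and current:
            # A new recipient tag marks the start of the next letter.
            records.append(current)
            current = {}
        current[tag] = value.strip()
    if current:
        records.append(current)
    return records


def letters_to_json(text):
    """Serialise the extracted records as a JSON array."""
    return json.dumps(letters_to_records(text), indent=2)
```

<p>Running letters_to_json(SAMPLE_TEXT) gives a JSON array with one object per letter, which is the sort of structure a table view can be built from.</p>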
<p>I&#8217;ve also used the same data set to produce a fairly <a title="Timeline of Dickens' letters" href="http://www.austgate.myzen.co.uk/development/timeline.php" target="_blank">basic timeline of the letters</a> which is being rewritten from here. It needs some rewriting to update to the new version of timeline.</p>
]]></content:encoded>
			<wfw:commentRss>http://austgate.co.uk/2009/07/mining-the-letters-of-charles-dickens/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

