<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>The Aust Gate &#187; charles dickens</title>
	<atom:link href="http://austgate.co.uk/tags/charles-dickens/feed/" rel="self" type="application/rss+xml" />
	<link>http://austgate.co.uk</link>
	<description>Open Knowledge and Literature</description>
	<lastBuildDate>Mon, 23 Jan 2012 18:10:47 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Weeknotes: Open Correspondence, Xapian and Linked Data</title>
		<link>http://austgate.co.uk/2010/11/weeknotes-open-correspondence-xapian-and-linked-data/</link>
		<comments>http://austgate.co.uk/2010/11/weeknotes-open-correspondence-xapian-and-linked-data/#comments</comments>
		<pubDate>Sun, 07 Nov 2010 10:58:20 +0000</pubDate>
		<dc:creator>iain_emsley</dc:creator>
				<category><![CDATA[Information Retrieval]]></category>
		<category><![CDATA[projects]]></category>
		<category><![CDATA[weeknotes]]></category>
		<category><![CDATA[charles dickens]]></category>
		<category><![CDATA[open_correspondence]]></category>
		<category><![CDATA[xapian]]></category>

		<guid isPermaLink="false">http://austgate.co.uk/?p=233</guid>
		<description><![CDATA[After last week&#8217;s server move, we discovered one or two things that needed to be changed before they could go live. The main one was the Xapian search that I had been working on. The initial version kept the Xapian server on the local machine and used that to index and search the letters, but [...]]]></description>
			<content:encoded><![CDATA[<p>After last week&#8217;s server move, we discovered one or two things that needed to be changed before they could go live. The main one was the Xapian search that I had been working on. The initial version kept the Xapian server on the local machine and used that to index and search the letters, but the new version is distributed across machines, so it required a brief change.</p>
<p>Opening a &#8220;one box wonder&#8221; Xapian search in Python is done via:</p>
<blockquote><p>xapian.WritableDatabase(db_path, xapian.DB_CREATE_OR_OPEN)</p></blockquote>
<p>where db_path is the path that you want to give the index. The index is then opened for searching using:</p>
<blockquote><p>xapian.Database(db_path)</p></blockquote>
<p>Since the project uses Pylons, the controller used a path out to the .ini file loaded at runtime to link to the correct database.</p>
<p>Using the documentation on the <a title="Xapian Documentation on remote backends" href="http://xapian.org/docs/remote.html" target="_blank">Xapian site for remote backends</a> and the <a title="Xapian Python bindings documentation" href="http://xapian.org/docs/bindings/python/" target="_blank">Python bindings</a>, I was able to quickly adjust the code so that xapian.WritableDatabase is replaced by:</p>
<blockquote><p>xapian.remote_open_writable("&lt;host name&gt;", &lt;port number&gt;)</p></blockquote>
<p>and xapian.Database is replaced by:</p>
<blockquote><p>xapian.remote_open("&lt;host name&gt;", &lt;port number&gt;)</p></blockquote>
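<p>As a minimal sketch of how the local and remote cases could sit behind one helper, the function below picks the call from a settings mapping, such as the one loaded from the .ini file. The key names (xapian.host, xapian.port, xapian.db_path) and the backend parameter are assumptions made for this illustration, not what the project actually uses.</p>

```python
def open_index(settings, writable=False, backend=None):
    """Open a Xapian index, local or remote, based on a settings mapping.

    The keys 'xapian.host', 'xapian.port' and 'xapian.db_path' are
    hypothetical names chosen for this sketch.
    """
    if backend is None:
        # Imported lazily so the helper can be exercised without Xapian installed.
        import xapian
        backend = xapian

    host = settings.get("xapian.host")
    if host:
        # Settings read from an .ini file are strings; the remote calls take an integer port.
        port = int(settings["xapian.port"])
        if writable:
            return backend.remote_open_writable(host, port)
        return backend.remote_open(host, port)

    # No host configured: fall back to the "one box" local database.
    db_path = settings["xapian.db_path"]
    if writable:
        return backend.WritableDatabase(db_path, backend.DB_CREATE_OR_OPEN)
    return backend.Database(db_path)
```

<p>Passing a stand-in object as backend makes it possible to check which call would be made without a running server.</p>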
<p>Once that is set up, all you need to do is start the TCP server, which is what I&#8217;ve been looking at. I downloaded the tar.gz file of Xapian-core from the Xapian site, configured and built it on Ubuntu Lucid Lynx, and then ran xapian-tcpsrv --port &lt;port number&gt; &lt;database name&gt; in a new terminal window, which allowed me to test the connections and get them ready for going live.</p>
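<p>For testing, the server can also be started from Python rather than from a separate terminal. This is only a sketch: the function names are my own, and it assumes xapian-tcpsrv is on the PATH. As far as I understand it, the server only accepts updates from remote clients when it is started with --writable, so the sketch makes that optional.</p>

```python
import subprocess


def tcpsrv_command(db_path, port, writable=False):
    """Build the argument list for xapian-tcpsrv."""
    command = ["xapian-tcpsrv", "--port", str(port)]
    if writable:
        # Needed if clients will connect with remote_open_writable.
        command.append("--writable")
    command.append(db_path)
    return command


def start_tcpsrv(db_path, port, writable=False):
    """Start the server as a child process and return the Popen handle."""
    return subprocess.Popen(tcpsrv_command(db_path, port, writable))
```

<p>Calling terminate() on the returned handle stops the server again once the test run is finished.</p>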
<p>Changes are afoot on the Open Correspondence site as well. Following a conversation that involved Keith Alexander, of <a title="Talis Platform" href="http://www.talis.com/platform" target="_blank">Talis</a>, the project is going to evolve in a more Linked Data direction, with references to the books, magazines, correspondents and so on. I&#8217;d already started going in this direction with the correspondent links (such as <a title="Georgina Hogarth correspondent link on Open Correspondence" href="http://www.opencorrespondence.org/letters/correspondent/Miss%20Hogarth" target="_blank">http://www.opencorrespondence.org/letters/correspondent/Miss%20Hogarth</a>), so this is really an extension of where we need to go to connect to other resources such as DBpedia, Wikipedia and so on. The fact that it is <a title="Dickens 2012 website" href="http://www.dickens2012.org" target="_blank">Dickens&#8217;s bicentenary in 2012</a> gives an added boost to the project. The Linked Data approach gives us the chance to create a framework for future expansion and for linking data sources together, not only at a literary level but also socially. It also encourages me to sort out the content negotiation work that was started, and to follow the FAQs that the <a title="Pedantic Web group site" href="http://pedantic-web.org/" target="_blank">Pedantic Web</a> group have posted, so that the site follows the best standards it can and builds them into future developments and directions.</p>
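<p>To give an idea of what the content negotiation involves, here is a rough sketch that picks a representation from the HTTP Accept header. The list of supported media types is an assumption for the example, and wildcards such as */* are deliberately ignored to keep it short; it is not the code running on the site.</p>

```python
def choose_representation(accept_header, default="text/html"):
    """Pick a media type for a resource from an HTTP Accept header."""
    # Hypothetical set of formats the site could serve.
    supported = ("text/html", "application/rdf+xml", "text/turtle")
    best_type, best_q = default, 0.0
    for item in accept_header.split(","):
        parts = [part.strip() for part in item.split(";")]
        media_type = parts[0].lower()
        q = 1.0  # a type listed without a q value has full weight
        for param in parts[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        # Keep the supported type with the highest weight; the first one wins ties.
        if media_type in supported and q > best_q:
            best_type, best_q = media_type, q
    return best_type
```

<p>A browser asking for text/html would get the HTML page for a correspondent, while a Linked Data client asking for application/rdf+xml would get the RDF description of the same resource.</p>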
]]></content:encoded>
			<wfw:commentRss>http://austgate.co.uk/2010/11/weeknotes-open-correspondence-xapian-and-linked-data/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Exporting and querying Dickens data</title>
		<link>http://austgate.co.uk/2010/03/exporting-data/</link>
		<comments>http://austgate.co.uk/2010/03/exporting-data/#comments</comments>
		<pubDate>Sun, 21 Mar 2010 12:15:35 +0000</pubDate>
		<dc:creator>iain_emsley</dc:creator>
				<category><![CDATA[Information Retrieval]]></category>
		<category><![CDATA[projects]]></category>
		<category><![CDATA[charles dickens]]></category>
		<category><![CDATA[rdf]]></category>

		<guid isPermaLink="false">http://austgate.co.uk/?p=137</guid>
		<description><![CDATA[As a follow-up to the posting regarding the proposed ontology, I&#8217;ve started to try and create a SPARQL endpoint. At some point soon, I want to use the new version of ARC as the version I&#8217;ve got here is a little out of date. After that the next thing should be to allow the [...]]]></description>
			<content:encoded><![CDATA[<p>As a follow-up to the posting regarding the proposed ontology, I&#8217;ve started to try and create a <a title="Dickens SPARQL endpoint" href="http://austgate.co.uk/dickens/export.php?type=rdf&amp;author=Dickens" target="_blank">SPARQL endpoint</a>. At some point soon, I want to use the new version of <a title="ARC website" href="http://arc.semsol.org/" target="_blank">ARC</a> as the version I&#8217;ve got here is a little out of date. After that the next thing should be to allow the endpoint&#8217;s output to be converted into other forms such as JSON.</p>
<p>UPDATE: I&#8217;ve created an endpoint using the default ARC settings here: <a title="RDF endpoint for Dickens project" href="http://austgate.co.uk/dickens/endpoint.php" target="_blank">http://austgate.co.uk/dickens/endpoint.php</a></p>
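<p>As a small sketch of how a client might address the endpoint, the snippet below builds a GET URL using the query parameter defined by the SPARQL protocol. The query is a generic one that makes no assumptions about the ontology, and the helper name is my own.</p>

```python
from urllib.parse import urlencode

ENDPOINT = "http://austgate.co.uk/dickens/endpoint.php"


def sparql_url(query, endpoint=ENDPOINT):
    """Return a GET URL that sends `query` to a SPARQL endpoint."""
    return endpoint + "?" + urlencode({"query": query})


# A generic query that lists ten arbitrary triples.
query = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"
print(sparql_url(query))
```

<p>The resulting URL can be pasted into a browser or fetched with any HTTP client.</p>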
]]></content:encoded>
			<wfw:commentRss>http://austgate.co.uk/2010/03/exporting-data/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Letters of Charles Dickens website</title>
		<link>http://austgate.co.uk/2009/09/letters-of-charles-dickens-website/</link>
		<comments>http://austgate.co.uk/2009/09/letters-of-charles-dickens-website/#comments</comments>
		<pubDate>Fri, 18 Sep 2009 20:39:31 +0000</pubDate>
		<dc:creator>iain_emsley</dc:creator>
				<category><![CDATA[Information Retrieval]]></category>
		<category><![CDATA[Text Mining]]></category>
		<category><![CDATA[charles dickens]]></category>
		<category><![CDATA[letters]]></category>

		<guid isPermaLink="false">http://austgate.co.uk/?p=94</guid>
		<description><![CDATA[I&#8217;ve finally posted the first draft of the Dickens website here: http://austgate.co.uk/dickens/index.php?author=Dickens. The idea is that it will allow users to derive networks across a variety of Victorian authors as and when I can develop the datasets. I&#8217;ve also been developing a small text ontology to add to the Friend of a Friend (FOAF) [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve finally posted the first draft of the Dickens website here: <a title="Letters of Dickens website" href="http://austgate.co.uk/dickens/index.php?author=Dickens" target="_blank">http://austgate.co.uk/dickens/index.php?author=Dickens</a>. The idea is that it will allow users to derive networks across a variety of Victorian authors as and when I can develop the datasets.</p>
<p>I&#8217;ve also been developing a small text ontology to add to the <a title="Friend of a Friend project" href="http://www.foaf-project.org/" target="_blank">Friend of a Friend</a> (FOAF) and <a title="Dublin Core ontology" href="http://dublincore.org/" target="_blank">Dublin Core</a> (DC) ontologies. I&#8217;ll post details later. The database schema is still under development but I hope to get that change done soon so that I can get on with the XML changes.</p>
]]></content:encoded>
			<wfw:commentRss>http://austgate.co.uk/2009/09/letters-of-charles-dickens-website/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Mining the Letters of Charles Dickens</title>
		<link>http://austgate.co.uk/2009/07/mining-the-letters-of-charles-dickens/</link>
		<comments>http://austgate.co.uk/2009/07/mining-the-letters-of-charles-dickens/#comments</comments>
		<pubDate>Tue, 14 Jul 2009 07:41:13 +0000</pubDate>
		<dc:creator>iain_emsley</dc:creator>
				<category><![CDATA[Information Retrieval]]></category>
		<category><![CDATA[Text Mining]]></category>
		<category><![CDATA[charles dickens]]></category>
		<category><![CDATA[simile]]></category>

		<guid isPermaLink="false">http://austgate.co.uk/?p=81</guid>
		<description><![CDATA[As an aside I&#8217;ve started a small project to begin visualising ways of searching the letters of Charles Dickens and exploring the Simile library which MIT have produced. It was originally an extension to the D-Space repository tool but Rufus Pollock used it in the Open Knowledge Foundation&#8217;s Weaving History project &#8211; to which I contributed the [...]]]></description>
			<content:encoded><![CDATA[<p>As an aside I&#8217;ve started a small project to begin visualising ways of searching the letters of Charles Dickens and exploring the <a title="Simile project page at MIT" href="http://simile.mit.edu/" target="_blank">Simile</a> library which MIT have produced.</p>
<p>It was originally an extension to the D-Space repository tool, but Rufus Pollock used it in the Open Knowledge Foundation&#8217;s <a title="Microfacts website" href="http://www.microfacts.org" target="_blank">Weaving History</a> project &#8211; to which I contributed the <a title="Milton threads on Microfacts" href="http://www.microfacts.org/thread/read/831cf372-1d28-4c98-ab55-c19899fa3840" target="_blank">Milton</a> JSON data file. Originally I&#8217;d used it just for biographical timelines but, thinking about it, I wondered how you could use it to mine datasets like the letters of Charles Dickens.</p>
<p>Dickens was a prolific letter writer (the Pilgrim edition extends to 12 thick volumes). I don&#8217;t have access to that data but I did download the first volume (of three) that his daughters edited.</p>
<p>Using Perl, I have extracted the date and recipient tags and converted the text file into JSON (as part of a larger process of converting the file into XML and using XSL to transform the data). I then created a table view of the data so that you can easily find the dates of the letters sent to certain people in <a title="Letters of Dickens project" href="/development/dickensletter.php" target="_blank">tabular form</a>.</p>
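<p>The extraction step can be sketched roughly as follows. This is in Python rather than the Perl I used, and the input layout (blank-line separated blocks of date: and recipient: lines) is a made-up stand-in for the real text file, so treat it as an illustration of the idea only.</p>

```python
import json


def letters_to_json(text):
    """Convert blocks of 'key: value' lines into a JSON array of objects."""
    letters = []
    # Each letter is assumed to be a block separated by a blank line.
    for block in text.strip().split("\n\n"):
        record = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            # Keep only the two tags of interest; ignore everything else.
            if sep and key.strip() in ("date", "recipient"):
                record[key.strip()] = value.strip()
        if record:
            letters.append(record)
    return json.dumps(letters, indent=2)


# Made-up sample input, not taken from the edition.
sample = "date: 1850-01-01\nrecipient: Example Recipient\n"
print(letters_to_json(sample))
```

<p>The resulting array of date and recipient pairs is the kind of structure that a table view can be built from.</p>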
<p>I&#8217;ve also used the same data set to produce a fairly <a title="Timeline of Dickens' letters" href="http://www.austgate.myzen.co.uk/development/timeline.php" target="_blank">basic timeline of the letters</a>, which is being rewritten. It needs updating to work with the new version of Timeline.</p>
]]></content:encoded>
			<wfw:commentRss>http://austgate.co.uk/2009/07/mining-the-letters-of-charles-dickens/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

