Archive for June, 2010

Weeknotes: All quiet on the accounting front

Sunday, June 27th, 2010

It’s been a week of relative frustration with priorities suddenly being shifted and the infrastructure road map looking more and more unclear.

The SOAP server is largely debugged and ready for more extensive testing on the server, and the back end has now been rewritten to capture more data. I cannot help feeling that it will need to change again to scale efficiently once more services go online, but right now I don’t have the expertise to do that. I’ll get there.

On a different tack, I’m back on the accounting project that I was on several months ago and making some headway with it. It’s grown since I was last involved in it, but not beyond what a decent set of specs and roadmaps can make manageable.

I’ve been thinking about my next book project, which is on the New Weird and genre over the last 15 years, and wondering how to use dbpedia’s influencedBy and influence terms to show how writers influence each other over a century. I’m tempted to put the data into a large RDF sheet and then use JavaScript or PHP to transform it into JSON, to see whether the Simile timeline software can display it usefully or whether I need to find / write something more appropriate. It does have to wait for me to finish the current book.
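As a rough sketch of that transformation step (in Python rather than JavaScript or PHP), the snippet below turns a set of "influenced by" relations into a JSON document of dated events. The writer names and years are placeholders, not real data, and the `dateTimeFormat` / `events` / `start` / `title` / `description` field names are my assumption about what a Simile timeline JSON event source expects, so they would need checking against the Simile documentation.

```python
import json


def influences_to_timeline(writers, influenced_by):
    """Build a timeline-style JSON string from writers and their influences.

    writers: dict mapping writer name -> year used to place them on the timeline.
    influenced_by: dict mapping writer name -> list of names that influenced them.
    """
    events = []
    # Order the events chronologically so the output is stable and readable.
    for name in sorted(writers, key=lambda n: writers[n]):
        sources = sorted(influenced_by.get(name, []))
        if sources:
            description = "Influenced by: " + ", ".join(sources)
        else:
            description = "No recorded influences"
        events.append(
            {"start": str(writers[name]), "title": name, "description": description}
        )
    # Assumed layout for a Simile timeline JSON event source.
    return json.dumps({"dateTimeFormat": "iso8601", "events": events}, indent=2)


# Placeholder data standing in for what would come out of the RDF.
writers = {"Writer A": 1900, "Writer B": 1950, "Writer C": 1975}
influenced_by = {"Writer B": ["Writer A"], "Writer C": ["Writer A", "Writer B"]}

timeline_json = influences_to_timeline(writers, influenced_by)
```

Keeping the influence relations in a plain dictionary means the same function could be fed from an RDF parser or a SPARQL result later without changing the output side.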

I forgot to link to the Open Correspondence blog post on the Open Knowledge Foundation’s blog which was posted a few days ago.

Weeknotes: PHP, SOAP, and Open Letters

Sunday, June 20th, 2010

It has been a fairly quiet week with the boss away. I’ve managed to complete a service to upload details from spreadsheets sent via email.

I’ve also managed to complete a SOAP service in PHP to listen for status updates and am just doing the final tests on it now. Once it’s up it can be repurposed for other companies. One of the things that I think will come up is how to store XML files most efficiently, as MySQL 5 appears to be tied to uploading files rather than just taking POST strings. I’m thinking of using something like Oracle’s BDB XML database (though the license appears to preclude our uses) or eXist, but that is something to come back to much later.

I’ve been thinking about the Open Correspondence site and the best way to allow it to be extended by other people. I think that the best way forward is to create an internal XML format which the load command can use and which anybody can use to create their own files and databases. It’s along the lines of the work I partially did in the Open Shakespeare project.

Given the boss is away, next week is time for finishing more things off. I’ve also created a Trac instance for internal purposes, and I think it’ll help with that bane of developing live: documentation.

Weeknotes: Data, service buses and trac

Sunday, June 13th, 2010

I’ve succumbed and I’ve got a microslot at the next Oxford Geek Nights where I’m talking about the Open Correspondence website. I’ve downloaded the rest of the Gutenberg copies of the Dickens letters but just need an evening to make some headway with transforming them.

I spent a fair amount of this week trying to get a status update server built using SOAP and PHP, which has been an ‘interesting’ task, but I seem to have finally got there. Having done some debugging at home on Friday, I’ve got to test the whole thing on Monday on the test server.

I’ve also been debugging the CSV uploads into the database and refactoring the code so that there is more re-use of similar objects. On top of that I started the documentation for the services and realised that I’d written most of the upload service for invoices as well. Bonus… So all I really need to do is spend a couple of days finishing things off at work so that the first versions of the services can go out.
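For illustration, a CSV upload of this kind can be sketched as below. This is in Python rather than the PHP used at work; the in-memory SQLite database stands in for the real database, and the `invoices` table and its `invoice_no` and `amount` columns are hypothetical names chosen for the example.

```python
import csv
import io
import sqlite3


def load_csv(connection, csv_text):
    """Parse CSV text with a header row and insert each row into the invoices table."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Convert every row up front so a malformed amount fails before anything is written.
    rows = [(row["invoice_no"].strip(), float(row["amount"])) for row in reader]
    # Using the connection as a context manager commits on success, rolls back on error.
    with connection:
        connection.executemany(
            "INSERT INTO invoices (invoice_no, amount) VALUES (?, ?)", rows
        )
    return len(rows)


connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE invoices (invoice_no TEXT PRIMARY KEY, amount REAL)"
)

sample = "invoice_no,amount\nINV-001,120.50\nINV-002,80.00\n"
inserted = load_csv(connection, sample)
total = connection.execute("SELECT SUM(amount) FROM invoices").fetchone()[0]
```

Parsing and validating all rows before the insert keeps a bad file from leaving a half-loaded table behind.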

Whilst doing all of that though, I realised that the queueing system that I was working on was only part of a solution to get all of our services working together. Instead of just queueing, I need to start thinking more along the lines of an enterprise service bus. So that’ll keep me busy then for a couple of weeks. My notebook has various notes and doodles, much to the enjoyment of my boss, who thinks it’s all old-fashioned.

I’ve also started putting together a Trac instance for work to see if it scales and helps with ticketing and information across our department’s groups. It’ll probably be sidelined for this week whilst I try and get everything put together again with regards to the data uploads.

Weeknotes: Redis, PHP, mail and SOAP

Sunday, June 6th, 2010

I’ve spent some time writing a queueing library using Redis as a backend. I started with the notion that it would need to be a FIFO queue but didn’t want to rely only on PHP’s built-in array functions, treating an array as a stack with array_pop or array_push. Whilst that might be faster, it doesn’t allow the queue to be stored if the worker / router calling the queue does not run until a certain time, so I looked at Redis. I drew some inspiration from MEMQ, a queue implementation using memcached. I wrote a quick set of functions to handle connection, enqueueing and dequeueing with the ever present Rediska as the underlying Redis connection library. I’m tempted to revisit this and to write my own connection to remove the reliance on Rediska. What I did learn was how to increase and decrease the number of items that could be dequeued. For some stupid reason, I’d got into my head that it would either be one or all items.

However, if you think about the LRANGE command (LLEN only returns the length of a list), you can read as many items as you want, drop them into an array and iterate across them. I need to try this, but you could feasibly fetch items from the middle of the list by changing the start and stop points in LRANGE. Normally I’d do something like LRANGE <list name> 0 -1 for all items or LRANGE <list name> 0 1 for the first two (the stop index is inclusive), but if you know there are 30 items and only want 5 from position 20, then LRANGE <list name> 20 24 would achieve the result. LRANGE only reads, so to dequeue the items they have to be removed afterwards, for example with LTRIM when taking them from the head of the list. It is not really germane to the queueing that I’ve been looking at (for system updates where I need everything or just the first item) but could be a useful adaptation for somebody else.
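To make the range indexing concrete, here is a small Python model of how Redis list ranges behave: zero-based, with an inclusive stop index and negative indexes counting from the tail. It runs against a plain Python list rather than a Redis server, and `lrange` and `dequeue_batch` are helper names made up for this sketch, not part of Rediska or any Redis client.

```python
def lrange(items, start, stop):
    """Mimic Redis LRANGE semantics on a Python list (inclusive stop index)."""
    n = len(items)
    if start < 0:
        start = max(n + start, 0)
    if stop < 0:
        stop = n + stop
    # An out-of-range or inverted window yields an empty result, as in Redis.
    if start >= n or start > stop:
        return []
    return items[start:min(stop, n - 1) + 1]


def dequeue_batch(items, count):
    """Take up to `count` items from the head of the queue and remove them.

    Models reading with LRANGE 0 count-1 and then trimming with LTRIM count -1.
    """
    if count <= 0:
        return []
    batch = lrange(items, 0, count - 1)
    del items[:len(batch)]
    return batch


queue = [f"job-{i}" for i in range(30)]

middle = lrange(queue, 20, 24)        # five items starting at position 20
everything = lrange(queue, 0, -1)     # the whole list, nothing removed
first_two = dequeue_batch(queue, 2)   # removes job-0 and job-1 from the queue
```

Against a real server, the read and the trim would need to run atomically (for example inside a transaction) so that two workers cannot read the same batch.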

The main challenge this week has been reading Excel attachments from email. PHP’s imap library allows you to read the structure of an email but is curiously reticent in retrieving data if you have MIME parts. I spent the best part of a day and a half getting a script to iterate over an incoming email, filter the parts so that it only examined the attachments’ MIME types, and then retrieve any attachments, either from a flat structure or by iterating over each part, before calling imap_fetchbody(). So far the fix appears to work and has allowed me to create a prototype mail service for receiving email data. It seems odd that in the era of web services financial data is still sent by insecure methods, but we must accommodate.
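The same walk-and-filter idea can be sketched with Python’s standard email package. This is not the PHP imap code described above, only an illustration of the approach: visit every MIME part, skip the multipart containers, and keep the parts whose content type looks like a spreadsheet. The sample message, addresses and file name are invented for the example.

```python
from email import message_from_bytes
from email.message import EmailMessage

# MIME types treated as Excel attachments in this sketch.
EXCEL_TYPES = {
    "application/vnd.ms-excel",
    "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
}


def extract_spreadsheets(raw_message):
    """Return (filename, bytes) pairs for every Excel part in a raw email."""
    message = message_from_bytes(raw_message)
    found = []
    # walk() visits the message itself and every nested part, so it covers
    # both flat messages and multipart ones.
    for part in message.walk():
        if part.is_multipart():
            continue  # containers carry no payload of their own
        if part.get_content_type() in EXCEL_TYPES:
            # decode=True undoes the base64 / quoted-printable transfer encoding.
            found.append((part.get_filename(), part.get_payload(decode=True)))
    return found


def build_sample_message():
    """Create a small multipart email with one fake spreadsheet attached."""
    msg = EmailMessage()
    msg["Subject"] = "Weekly figures"
    msg["From"] = "sender@example.com"
    msg["To"] = "uploads@example.com"
    msg.set_content("Figures attached.")
    msg.add_attachment(
        b"fake spreadsheet bytes",
        maintype="application",
        subtype="vnd.ms-excel",
        filename="figures.xls",
    )
    return msg.as_bytes()


attachments = extract_spreadsheets(build_sample_message())
```

Filtering on content type alone is a simplification; some mail clients send spreadsheets as application/octet-stream, so a real service would probably check the file extension as well.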

I’ve also been looking at PHP’s SOAP library to create a status update service, which will probably utilise Service Orientated Architecture to create a stable, scalable service. Initially I created a WSDL file using the Eclipse IDE, but that threw all sorts of issues and I ended up using Zend’s WSDL generator tool running across the existing server. I must look into this, but there might be a conflict in versions of WSDL as well as a first-time learning curve. I’m hoping to get the first version of the service up this week.
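For a sense of what such a service receives on the wire, the sketch below parses a SOAP 1.1 envelope carrying a status update. It is written in Python with the standard XML parser, not PHP’s SOAP library, and the `urn:example:status` namespace and the `StatusUpdate`, `reference` and `status` element names are hypothetical; the real names would come from the WSDL.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:status"  # hypothetical service namespace

SAMPLE_REQUEST = f"""<?xml version="1.0"?>
<soap:Envelope xmlns:soap="{SOAP_NS}" xmlns:st="{SERVICE_NS}">
  <soap:Body>
    <st:StatusUpdate>
      <st:reference>ABC-123</st:reference>
      <st:status>processed</st:status>
    </st:StatusUpdate>
  </soap:Body>
</soap:Envelope>"""


def parse_status_update(xml_text):
    """Pull the reference and status out of a SOAP status update request."""
    root = ET.fromstring(xml_text)
    body = root.find(f"{{{SOAP_NS}}}Body")
    if body is None:
        raise ValueError("not a SOAP envelope: no Body element")
    update = body.find(f"{{{SERVICE_NS}}}StatusUpdate")
    if update is None:
        raise ValueError("Body does not contain a StatusUpdate element")
    return {
        "reference": update.findtext(f"{{{SERVICE_NS}}}reference"),
        "status": update.findtext(f"{{{SERVICE_NS}}}status"),
    }


result = parse_status_update(SAMPLE_REQUEST)
```

A SOAP library normally does this unwrapping from the WSDL automatically; seeing the raw envelope is mainly useful when debugging mismatches between a generated WSDL and what clients actually send.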

I suspect that this week is going to be spent completing the commission and service status services, as well as possibly doing some documentation, as it is beginning to pile up.