Extracting Music Streams from Printouts

I have gone back to work on the Eric Sunderland archive. I also submitted a poster abstract, with some initial comments, to DMRN+20 (Digital Music Research Network). It will become part of a talk to be given next year.

A focus for this trip was to take more photos of the printouts containing music and other data. I will be running OCR (Optical Character Recognition) on these using Python’s bindings to OpenCV, with Tesseract on the command line. Initial results have been a bit interesting, but more practice might help.
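As a rough idea of what that pipeline could look like, here is a minimal sketch. It is not my final code: the function names, the cleaning steps (median blur plus Otsu thresholding), the `_clean.png` suffix and the page segmentation mode are all assumptions chosen for illustration, and they will probably need tuning for photographed computing paper. It assumes the `opencv-python` package and the `tesseract` command-line tool are both installed.

```python
import subprocess
from pathlib import Path


def preprocess(src: Path, dst: Path) -> Path:
    """Greyscale, lightly denoise, and binarise a photo before OCR."""
    # Imported here so the command helper below can be used without OpenCV.
    import cv2

    grey = cv2.imread(str(src), cv2.IMREAD_GRAYSCALE)
    if grey is None:
        raise FileNotFoundError(f"could not read image: {src}")
    # A small median blur removes speckle; Otsu picks the threshold itself.
    grey = cv2.medianBlur(grey, 3)
    _, binary = cv2.threshold(grey, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    if not cv2.imwrite(str(dst), binary):
        raise OSError(f"could not write image: {dst}")
    return dst


def tesseract_command(image: Path, lang: str = "eng", psm: int = 6) -> list[str]:
    """Build the tesseract call; 'stdout' sends the text to standard output."""
    # psm 6 (a single uniform block of text) is a guess for printout pages.
    return ["tesseract", str(image), "stdout", "-l", lang, "--psm", str(psm)]


def ocr(image: Path) -> str:
    """Clean up one photo and return the text Tesseract finds in it."""
    cleaned = preprocess(image, image.with_name(image.stem + "_clean.png"))
    result = subprocess.run(
        tesseract_command(cleaned), capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling `ocr(Path("printout.jpg"))` (a hypothetical file name) would write a cleaned copy next to the photo and return the recognised text as a string. Keeping the clean-up step separate from the Tesseract call should make it easier to try different settings on the mixed music and text pages.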

As a test, I also tried the OCR option on a Samsung phone on a couple of pure text images. The results are interesting. I still need to check the other pipeline and see how the data is processed. The mixed images on computing paper seem to be causing some issues, so the Python pipeline might be more useful here. I suspect that this was not the intended use for the tool, but I wanted to explore the option.

The aim is to be able to process the data computationally, using natural language processing to augment the OCRed data.
