Cultures of Knowledge – Collaboration, Early Modern Letters Online, and Horizon 2020

Collaboration, Early Modern Letters Online, and Horizon 2020 by Howard Hotson and introduced by Dave de Roure. Notes are unedited.

D de R introduced the space of new scholarship opened up by new technologies and big data. Interested in the engagement of large numbers of people and in social machines (Berners-Lee, Weaving the Web, 1999, pp. 172-175). People do the creative work, machines do the administration. “The ability to create new forms of social process would be given to the world at large”, so use is mandated not by the machine but by the user.

Seeing social editions in the humanities: Crowdmap the Crusades, the social edition of the Devonshire, and Transcribe Bentham. EEBO becoming open in January.

Problem: relating the communications revolution of today to that of Early Modern Europe.

Used maps at 100-year intervals to show the spread of postal networks. Increased communication led to a virtual community of scholars exchanging knowledge, though one affected by political boundaries. Used the example of Comenius, who was displaced by the wars; his letters took 150 years to re-assemble. Linking content with places, but this leads to silos of letters and collections.

EMLO as a way of looking at the social and technical aspects of linking the data and reconstructing the data for the network. First phase: mostly editorial, plus designing tools and interfaces, with 60,000 networks. Second phase: 40,000, with mainly digital scholarship. Users can collect.

Phase 3, with 60,000, runs from 2015 to 2017 to design and build a system collaboratively. Using semi-automation of data standardisation and visualisation. Using existing data to standardise the data itself and clean up anything going into it. 20,000 letters of the Dutch Golden Age about to be released. Some very interesting people coming up in the next few years, such as Ussher. Holding back the data as it comes in so that it can be attributed correctly.

Can use search to build a community rather than just a single author. Records created by teams working across institutions. Also shows the images alongside the digitised version. Celebrate the diversity of the data, both to attract it in the first place and to attribute it correctly. Balance between editing and swamping the editors. Prosopography: two problems, Hartlib and Comenius, against the complicated backdrop of politics.

Using a web form to encourage contributors to enter the data correctly and link it to existing data. Trying to train the system to develop and find the pathways. Using OpenRefine, GeoNames, VIAF/SPARQL. Bringing one set of functionality and one set of output. Intend to reduce the Mechanical Turk aspect by using machines.

Phase 3 is creating visualisations for the data sets. Joining with the Mapping the Republic of Letters tools and using Palladio to visualise the data. Rather than replicate, work together. COST funding allows EMLO to build a network of scholars: librarians, archivists, IT and scholars. Collaboration is the key. Looking at building standards for creating and exchanging data between systems. Building on Hyvönen's work in some part. Link the letters, then build the analytical machinery to mine them.
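To make the idea of semi-automated standardisation more concrete, here is a minimal sketch in Python. It is not EMLO's actual pipeline. The authority list, the identifiers, the function names and the 0.85 threshold are all invented for illustration. An incoming name is compared against existing records, and anything below the threshold is left for a human editor rather than linked automatically.

```python
from difflib import SequenceMatcher

# Hypothetical authority list built from existing data.
# The identifiers are placeholders, not real EMLO or VIAF numbers.
AUTHORITY = {
    "Comenius, Jan Amos": "person-001",
    "Hartlib, Samuel": "person-002",
    "Ussher, James": "person-003",
}


def normalise(name):
    """Lower-case, drop commas and full stops, and sort the name parts
    so that 'Samuel Hartlib' and 'Hartlib, Samuel' compare as equal."""
    cleaned = name.lower().replace(",", " ").replace(".", " ")
    return " ".join(sorted(cleaned.split()))


def similarity(a, b):
    """Similarity between two names, from 0.0 (different) to 1.0 (same)."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def reconcile(raw_name, threshold=0.85):
    """Return (identifier, canonical_name, score) for the best candidate.

    If the best score is below the (assumed) threshold, identifier and
    canonical_name are None, meaning: send this one to an editor.
    """
    best = max(AUTHORITY, key=lambda candidate: similarity(raw_name, candidate))
    score = similarity(raw_name, best)
    if score >= threshold:
        return AUTHORITY[best], best, score
    return None, None, score


if __name__ == "__main__":
    for incoming in ["Samuel Hartlib", "Hartlib, Samuell", "Smith, John"]:
        print(incoming, "->", reconcile(incoming))
```

The threshold is where the balance between editing and swamping editors shows up: set it lower and more records are linked by machine (with more wrong attributions), set it higher and more land in the editors' queue.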

A need to get ahead of the REF and the metrics by which contributions can be rendered invaluable. Developing a digital journal based on large collections of data though with scholarly standards. Animated visualisations within the journal to link to the data set. (Links to Figshare / tools?).

Ingesting data that arrives without clear metadata standards. Enriching it with VIAF numbers and standards. Can offer a service to libraries: ingest their data and provide a standardised electronic version in return. Maintaining the integrity of the original records while providing a standard. Projects about contributing data are being planned, though some contributors are reluctant. Takes time.
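One way to picture "maintaining integrity of the original records but providing a standard" is to keep the contributed record untouched and put any enrichment in a separate layer. The Python sketch below assumes this design; the field names, the `enrich` function and the placeholder identifier are hypothetical and do not describe EMLO's data model.

```python
import copy


def enrich(record, authority_ids):
    """Wrap a contributed record with a standardised layer.

    The original record is deep-copied and stored as-is; identifiers
    are added alongside it, never written into it.
    """
    enriched = {"original": copy.deepcopy(record), "standardised": {}}
    author = record.get("author")
    if author in authority_ids:
        enriched["standardised"]["author_id"] = authority_ids[author]
    return enriched


if __name__ == "__main__":
    # Placeholder identifier only; not a real VIAF number.
    ids = {"Hartlib, Samuel": "VIAF-PLACEHOLDER-1"}
    contributed = {"author": "Hartlib, Samuel", "date": "1641?", "note": "as catalogued"}
    print(enrich(contributed, ids))
```

Because the library's own wording (including uncertain dates such as "1641?") survives unchanged, the standardised version can be handed back without overwriting what the holding institution recorded.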

Topic modelling? Licensing? Using text mining to drive the project? Critical editions to link critical and annotation tools together, also across libraries. Copyright and revenue generation are issues when encouraging collections to contribute.