Last week on Friday, 17th April, Kleanthi Georgala visited AKSW and gave a talk entitled “Traces Through Time: Probabilistic Record Linkage – Medieval and Early Modern”. More information below.
This innovative, multi-disciplinary project will deliver practical analytical tools to support large-scale exploration of big historical datasets. The project aims to bring together international research experience in the digital humanities, natural language processing, information science, data mining and linked data, with large, complex and diverse ‘big data’ spanning over 500 years of British history.
The project’s technical outputs will be a methodology and supporting toolkit that identify individuals within and across historical datasets, allowing people to be traced through the records and enabling their stories to emerge from the data. The tools will handle the ‘fuzzy’ nature of historical data, including aliases, incomplete information, spelling variations and the errors that are inevitably encountered in official records. The toolkit will be open and configurable, offering the flexibility to formulate and ask interesting questions of the data, exploring it in ways that were not imagined when the records were created. The open approach will create opportunities for further enhancement or re-use and offers the further potential to deliver the outputs as a service, extensible to new datasets as these become available. This brings the vision of finding and linking individuals in new combinations of datasets, from the widest range of historical sources.
The Traces Through Time project was a collaboration between The National Archives in the United Kingdom, The Institute of Historical Research, the University of Brighton and the University of Leiden. This presentation describes the probabilistic record linkage system that was developed for this task by the University of Leiden team and introduces some insightful examples of matches that our system was able to find in the Medieval and Early Modern data, as well as a number of experiments on artificial data to test the workings of the system.
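For readers unfamiliar with probabilistic record linkage, the general idea can be illustrated with a minimal Python sketch in the style of the classic Fellegi-Sunter model. This is not the system presented in the talk: the field names, the m/u probabilities, the similarity threshold and the function names below are all hypothetical and chosen only for illustration. Each field that agrees (after tolerating spelling variation) adds evidence for a match, each field that disagrees subtracts evidence, and missing fields contribute nothing.

```python
import math
from difflib import SequenceMatcher

# Hypothetical per-field probabilities (assumptions, not values from the project):
#   m = P(field agrees | records refer to the same person)
#   u = P(field agrees | records refer to different people)
FIELD_PROBS = {
    "forename": (0.90, 0.05),
    "surname": (0.95, 0.01),
    "place": (0.80, 0.10),
}


def fields_agree(a, b, threshold=0.75):
    """Fuzzy agreement test that tolerates spelling variation.

    Returns None when either value is missing, so the field can be skipped.
    """
    if not a or not b:
        return None
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def match_weight(rec_a, rec_b):
    """Sum of log2 likelihood ratios over all fields present in both records."""
    total = 0.0
    for field, (m, u) in FIELD_PROBS.items():
        agree = fields_agree(rec_a.get(field), rec_b.get(field))
        if agree is None:
            continue  # incomplete information: no evidence either way
        if agree:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


if __name__ == "__main__":
    a = {"forename": "John", "surname": "Smyth", "place": "London"}
    b = {"forename": "Jon", "surname": "Smith", "place": "London"}
    c = {"forename": "Alice", "surname": "Baker", "place": "York"}
    print("a vs b:", match_weight(a, b))  # spelling variants, positive weight
    print("a vs c:", match_weight(a, c))  # unrelated records, negative weight
```

In a real setting the m and u probabilities would be estimated from data rather than fixed by hand, and pairs would be classified as matches, non-matches or uncertain by comparing the weight against chosen thresholds.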
Also, you may view her talk here: