Hello Community! We are very pleased to announce the acceptance of two papers in the ESWC 2017 research track. ESWC 2017 will be held in Portoroz, Slovenia, from the 28th of May to the 1st of June. In more detail, we will present the following papers:
- “WOMBAT – A Generalization Approach for Automatic Link Discovery” Mohamed Ahmed Sherif, Axel-Cyrille Ngonga Ngomo, Jens Lehmann
Abstract. A significant portion of the evolution of Linked Data datasets lies in updating the links to other datasets. An important challenge when aiming to update these links automatically under the open-world assumption is the fact that usually only positive examples for the links exist. We address this challenge by presenting and evaluating WOMBAT, a novel approach for the discovery of links between knowledge bases that relies exclusively on positive examples. WOMBAT is based on generalisation via an upward refinement operator to traverse the space of link specification. We study the theoretical characteristics of WOMBAT and evaluate it on 8 different benchmark datasets. Our evaluation suggests that WOMBAT outperforms state-of-the-art supervised approaches while relying on less information. Moreover, our evaluation suggests that WOMBAT’s pruning algorithm allows it to scale well even on large datasets.
- “All That Glitters is not Gold – Rule-Based Curation of Reference Datasets for Named Entity Recognition and Entity Linking” Kunal Jha, Michael Röder and Axel-Cyrille Ngonga Ngomo
Abstract. The evaluation of Named Entity Recognition as well as Entity Linking systems is mostly based on manually created gold standards. However, the current gold standards have three main drawbacks. First, they do not share a common set of rules pertaining to what is to be marked and linked as an entity. Moreover, most of the gold standards have not been checked by other researchers after they have been published and hence commonly contain mistakes. Finally, they lack actuality as in most cases the reference knowledge base used to link the entities has been refined over time while the gold standards are typically not updated to the newest version of the reference knowledge base. In this work, we analyze existing gold standards and derive a set of rules for annotating documents for named entity recognition and entity linking. We derive Eaglet, a tool that supports the semi-automatic checking of a gold standard based on these rules. A manual evaluation of Eaglet’s results shows that it achieves an accuracy of up to 88% when detecting errors. We apply Eaglet to 13 gold standards and detect 38,453 errors. An evaluation of 10 tools on a subset of these datasets shows a performance difference of up to 10% micro F-measure on average.
This work has been supported by the European Union’s H2020 research and innovation action HOBBIT (GA no. 688227), the European Union’s H2020 research and innovation action SLIPO (GA no. 731581), the BMWi project SAKE (project no. 01MD15006E), the BMBF project DIESEL (project no. 01QE1512C) and the BMWi project GEISER (project no. 01MD16014E).