DBpediaSameAs: An approach to tackling heterogeneity in DBpedia identifiers by Andre Valdestilhas
This work addresses heterogeneity in DBpedia identifiers. While searching for owl:sameAs occurrences in order to find co-references between different data sets, several transient, redundant owl:sameAs occurrences were found among DBpedia identifiers.
The work makes three contributions to solve this problem: (1) a DBpedia Unique Identifier, which normalizes owl:sameAs occurrences by providing a single DBpedia identifier in place of several transient, redundant owl:sameAs occurrences; (2) rating and suggesting links, which improves link quality and makes it possible to gather statistics about the links; and (3) a performance gain: the physical size of the data decreased from 16.2 GB to 6 GB of triples, and we are also able to perform normalization and create an index.
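To make the idea of normalization concrete, here is a minimal sketch of collapsing owl:sameAs pairs into one canonical identifier per equivalence class. It is an illustration only, not the DBpediaSameAs implementation: the function name `normalize_same_as`, the union-find strategy, the rule of choosing the lexicographically smallest URI as the canonical one, and the example.org URIs are all assumptions made for this example.

```python
def normalize_same_as(pairs):
    """Map every URI to one canonical URI per owl:sameAs equivalence class.

    Assumption for this sketch: the canonical URI of a class is simply
    its lexicographically smallest member.
    """
    parent = {}

    def find(uri):
        # Register unseen URIs as their own root, then walk up to the root.
        parent.setdefault(uri, uri)
        root = uri
        while parent[root] != root:
            root = parent[root]
        # Path compression: point every visited node directly at the root.
        while parent[uri] != root:
            parent[uri], uri = root, parent[uri]
        return root

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the smaller URI as root so it ends up as the canonical one.
            if root_b < root_a:
                root_a, root_b = root_b, root_a
            parent[root_b] = root_a

    for a, b in pairs:
        union(a, b)
    return {uri: find(uri) for uri in parent}


# Hypothetical sameAs pairs: three URIs for one entity, two for another.
same_as_pairs = [
    ("http://example.org/b", "http://example.org/a"),
    ("http://example.org/c", "http://example.org/b"),
    ("http://example.org/y", "http://example.org/x"),
]
canonical = normalize_same_as(same_as_pairs)
```

With such a mapping, each redundant owl:sameAs statement can be replaced by a lookup of the canonical identifier, so every equivalence class is stored once.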
The usability of the interface was evaluated using a standard usability questionnaire. All interviewed participants responded positively, which indicates that the DBpediaSameAs property is easy to use and can thus lead to novel insights.
As a proof of concept, an implementation is provided as a web system that includes a web service and a graphical user interface.
Dynamic-LOD: An approach to count links using Bloom filters by Ciro Baron
The Web of Linked Data is growing, and it is becoming increasingly necessary to discover the relationships between different datasets.
Ciro Baron will present an approach to link counting that uses Bloom filters (BF) to compare datasets and approximately count the links between them, addressing the lack of up-to-date metadata about linksets. The paper, which compares performance against classical approaches such as binary search trees (BST) and hash tables (HT), is available at http://svn.aksw.org/papers/2015/ISWC_DynLOD/public.pdf. The results show that the Bloom filter is 12x more memory-efficient while offering adequate query speed.
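The following minimal sketch illustrates how a Bloom filter can be used to approximately count links between two datasets. It is not the Dynamic-LOD code: the class `BloomFilter`, the function `count_links`, the SHA-256 based double hashing, the default false-positive rate and the example.org URIs are assumptions chosen for illustration. The sizing formulas are the standard ones for Bloom filters.

```python
import hashlib
import math


class BloomFilter:
    """A small Bloom filter over strings (illustrative sketch only)."""

    def __init__(self, expected_items, false_positive_rate=0.01):
        # Standard sizing: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hashes.
        self.size = max(8, math.ceil(
            -expected_items * math.log(false_positive_rate) / (math.log(2) ** 2)))
        self.num_hashes = max(1, round(self.size / expected_items * math.log(2)))
        self.bits = bytearray((self.size + 7) // 8)

    def _positions(self, item):
        # Double hashing: derive k bit positions from two 64-bit hash values.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.size for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def count_links(source_objects, target_filter):
    """Count object URIs of the source dataset that (probably) occur in the target."""
    return sum(1 for uri in source_objects if uri in target_filter)


# Hypothetical data: subjects of a target dataset, objects of a source dataset.
target_subjects = [f"http://example.org/target/{i}" for i in range(1000)]
source_objects = target_subjects[:25] + [
    f"http://example.org/other/{i}" for i in range(75)]

target_filter = BloomFilter(expected_items=len(target_subjects))
for uri in target_subjects:
    target_filter.add(uri)

approximate_links = count_links(source_objects, target_filter)
```

A Bloom filter never yields false negatives, so every true link is counted. False positives can inflate the count slightly, which is why the result is approximate and why a quantitative check of false positives matters.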
In addition, Ciro will show a small cloud generated for all English DBpedia datasets and vocabularies available in Linked Open Vocabularies (LOV).
We evaluated Dynamic-LOD in three respects: first, by analyzing data structure performance, comparing BF with HT and BST; second, by a quantitative evaluation of false positives and of the speed of counting links in a dense scenario such as DBpedia; and third, on a large scale based on lod-cloud distributions. All three evaluations indicate that BF is a good choice for what our work proposes.
About the AKSW Colloquium
This event is part of a series of events about Semantic Web technology. Please see http://wiki.aksw.org/Colloquium for further information about previous and future events. As always, Bachelor and Master students are able to get points for attendance and there is complimentary coffee and cake after the session.