DBpedia @ SEMANTiCS 2017

We are happy to invite you to the 10th DBpedia Community Meeting, which will be held in Amsterdam. During SEMANTiCS 2017 (Sep 11-14), the DBpedia community will get together on the 14th of September for the DBpedia Day.

What cool things do you do with DBpedia? Present your tools and datasets at the DBpedia Community Meeting. Please submit your proposal in our form.


  • Keynote by Chris Welty (Google Research)
  • Keynote by Victor de Boer (VU University)
  • DBpedia Association Hour & Dutch DBpedia Hour
  • Session on the DBpedia ontology by members of the DBpedia ontology committee
  • DBpedia Tutorial Session (for people who want to learn about DBpedia)
  • We will talk with Mike Tung, CEO and founder of Diffbot, about the DBpedia NLP department via video stream.


Attending the DBpedia Community Meeting costs €40 (excl. registration fee and VAT). DBpedia members get free admission; please contact your nearest DBpedia chapter or the DBpedia Association for a promotion code.

Please check all details here.


If you can’t wait until the end of SEMANTiCS, you can already participate in the workshop “Two worlds, one goal: A Reliable Linked Data ecosystem for media”, held by DBpedia and Wolters Kluwer on the 11th of September. This half-day workshop explores major topics for publishers and libraries from DBpedia’s and Wolters Kluwer’s perspectives. Both communities will dive into core areas like interlinking, metadata and data quality, and address challenges such as the fundamental requirements for publishing data on the web. Did we spark your interest? Check our detailed program here and get your ticket today.

We are looking forward to meeting you in Amsterdam!


PRESS RELEASE: Amsterdam - this year’s hotspot on Linked Data Strategies & Practices

September 11-14, 2017: International experts from science and industry demonstrate the business value of smart data services at SEMANTiCS 2017

Experts from science and industry meet at Europe’s biggest Linked Data and Semantic Web event to present and discuss the latest achievements, challenges and future perspectives of new data management practices. The conference for Semantic Systems is now in its 13th edition and is run by a mixed industry and research consortium formed by Semantic Web Company (Austria), Institute for Applied Informatics (Germany), University of Applied Science St. Pölten (Austria) and the Dutch partners VU, TNO and Kadaster, together with Wolters Kluwer as major industry sponsor.

Most companies and public administrations nowadays are struggling to catch up with new data management practices, either by initializing a data strategy from scratch or by adjusting their old strategy to the affordances of new technological environments, legal frameworks or business models. The SEMANTiCS conference gives insights into data management strategies, discusses cases of data-driven business models and gives advice on how to catch up with the latest developments at the dawn of smart, networked data.

The exchange between industry and research is facilitated by a rich program consisting of six keynotes from companies like EA Games, Wolters Kluwer, and OTTO, followed by a total of 36 industry and 25 scientific presentations, 17 workshops, a poster and demo area and numerous social side events.

Programme Overview

September 11, 2017: Pre-Conference Workshops
September 12, 2017: Main Conference Day 1: Keynotes by Wolters Kluwer, EA Games and Toulouse Institute of Computer Science Research
September 13, 2017: Main Conference Day 2: Keynotes by OTTO & Ghent University
September 14, 2017: Post-Conference Workshops & DBpedia Day: Keynote by Chris Welty

This year’s conference focuses on the business value of Linked Data technologies and services as an enabling technology for a cost-efficient, flexible and sustainable enterprise data strategy.

This is addressed in the opening keynote by Sandeep Sacheti, Executive Vice President, Customer Information Management & Operational Excellence at Wolters Kluwer, and in the management panel on Wednesday with Frank Tierolff (board member of Kadaster), Henk Jan Vink (director Networked Innovation of TNO), Kor Brandts (director DUO) and Michiel Borgers (Dutch Ministry of Finance).

The full programme, with talks and presentations by leading researchers in the field and leading industry adopters, can be found on the conference page at http://2017.semantics.cc

SEMANTiCS 2017 Key Data

Date: 11-14 September 2017
Venue: Meervaart Theatre, Amsterdam, The Netherlands
Website: http://2017.semantics.cc
Programme: http://2017.semantics.cc/programme
Twitter: @semanticsconf
Contact: Dissemination Chair Arjen Santema
arjen.santema@kadaster.nl | +31 (0)652481774

View the full release here: PDF

We look forward to seeing you at SEMANTiCS 2017.


AKSW Colloquium, 01.09.2017, IDOL: Comprehensive & Complete LOD Insights

At the AKSW Colloquium on Friday, 1st of September, at 10:40 AM, Gustavo Publio will present the paper “IDOL: Comprehensive & Complete LOD Insights” by Ciro Baron Neto, Dimitris Kontokostas, Amit Kirschenbaum, Gustavo Publio, Diego Esteves, and Sebastian Hellmann, which will also be presented at the upcoming SEMANTiCS’17 conference in Amsterdam, Netherlands.


“Over the last decade, we observed a steadily increasing amount of
RDF datasets made available on the web of data. The decentralized
nature of the web, however, makes it hard to identify all these
datasets. Even more so, when downloadable data distributions are
discovered, only insufficient metadata is available to describe the
datasets properly, thus posing barriers on its usefulness and reuse.
In this paper, we describe an attempt to exhaustively identify the
whole linked open data cloud by harvesting metadata from multiple
sources, providing insights about duplicated data and the general
quality of the available metadata. This was only possible by using a
probabilistic data structure called Bloom Filter. Finally, we published
a dump file containing metadata which can further be used to enrich
existent datasets.”
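
The abstract credits a Bloom filter with making the duplicate analysis feasible. The following is a minimal sketch of that data structure, not the authors' implementation: the class, the function `find_probable_duplicates`, the filter size and the example URLs are all hypothetical. A Bloom filter never misses an item it has seen, but it may occasionally report an unseen item as seen, so the results are "probable" duplicates.

```python
import hashlib


class BloomFilter:
    """A fixed-size Bloom filter: no false negatives, tunable false positives."""

    def __init__(self, num_bits: int = 1 << 16, num_hashes: int = 4) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive k bit positions from two halves of one SHA-256 digest
        # (double hashing), so only one hash call per item is needed.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def find_probable_duplicates(urls):
    """Return URLs that were probably seen earlier in the stream."""
    seen = BloomFilter()
    duplicates = []
    for url in urls:
        if url in seen:
            duplicates.append(url)
        else:
            seen.add(url)
    return duplicates
```

The appeal for a crawl of this kind is memory: the filter stores a fixed number of bits regardless of how long the harvested URLs or triples are.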


About the AKSW Colloquium

This event is part of a series of events about Semantic Web technology. Please see http://wiki.aksw.org/public/colloquium for further information about previous and future events. As always, Bachelor and Master students are able to get points for attendance and there is complimentary coffee and cake after the session.


AKSW at ISWC2017

We are very pleased to announce that AKSW will be presenting 2 papers at ISWC 2017, which will be held on 21-24 October in Vienna, Austria. The demo and workshop papers are yet to be announced.
The International Semantic Web Conference (ISWC) is the premier international forum where Semantic Web / Linked Data researchers, practitioners, and industry specialists come together to discuss, advance, and shape the future of semantic technologies on the web, within enterprises and in the context of public institutions.

Here is the list of accepted papers with their abstracts:

“Distributed Semantic Analytics using the SANSA Stack” by Jens Lehmann, Gezim Sejdiu, Lorenz Bühmann, Patrick Westphal, Claus Stadler, Ivan Ermilov, Simon Bin, Muhammad Saleem, Axel-Cyrille Ngonga Ngomo and Hajira Jabeen.

Abstract: A major research challenge is to perform scalable analysis of large-scale knowledge graphs to facilitate applications like link prediction, knowledge base completion and reasoning. Analytics methods which exploit expressive structures usually do not scale well to very large knowledge bases, and most analytics approaches which do scale horizontally (i.e., can be executed in a distributed environment) work on simple feature-vector-based input. This software framework paper describes the ongoing Semantic Analytics Stack (SANSA) project, which supports expressive and scalable semantic analytics by providing functionality for distributed computing on RDF data.

“Iguana: A Generic Framework for Benchmarking the Read-Write Performance of Triple Stores” by Felix Conrads, Jens Lehmann, Axel-Cyrille Ngonga Ngomo, Muhammad Saleem, and Mohamed Morsey.

Abstract: The performance of triple stores is crucial for applications which rely on RDF data. Several benchmarks have been proposed that assess the performance of triple stores. However, no integrated benchmark-independent execution framework for these benchmarks has been provided so far. We propose a novel SPARQL benchmark execution framework called IGUANA. Our framework complements benchmarks by providing an execution environment which can measure the performance of triple stores during data loading, data updates as well as under different loads. Moreover, it allows a uniform comparison of results on different benchmarks. We execute the FEASIBLE and DBPSB benchmarks using the IGUANA framework and measure the performance of popular triple stores under updates and parallel user requests. We compare our results with state-of-the-art benchmarking results and show that our benchmark execution framework can unveil new insights pertaining to the performance of triple stores.
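
To illustrate what measuring a store "under parallel user requests" can look like, here is a rough sketch of a throughput harness. It is not IGUANA's code or API: `measure_throughput`, `run_worker` and the stand-in `fake_store` are hypothetical names, and queries per second is assumed here as the reported metric. In practice `execute_query` would wrap a client call to a real SPARQL endpoint.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_worker(execute_query, queries, duration_s):
    """Issue queries in a loop until the time budget is spent; return the count."""
    deadline = time.perf_counter() + duration_s
    completed = 0
    while time.perf_counter() < deadline:
        for query in queries:
            execute_query(query)
            completed += 1
            if time.perf_counter() >= deadline:
                break
    return completed


def measure_throughput(execute_query, queries, workers=4, duration_s=1.0):
    """Run `workers` simulated users in parallel and report queries per second."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(run_worker, execute_query, queries, duration_s)
            for _ in range(workers)
        ]
        total = sum(f.result() for f in futures)
    elapsed = time.perf_counter() - start
    return {"workers": workers, "queries": total, "qps": total / elapsed}


def fake_store(query):
    # Stand-in for a real triple store client; sleeps to imitate latency.
    time.sleep(0.001)
    return []
```

Running the same query mix with different `workers` values against each store gives comparable numbers, which is the kind of uniform comparison the abstract argues for.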

Thank you! We look forward to seeing you at ISWC 2017.

This work was supported by the European Union’s H2020 research and innovation action HOBBIT (GA no. 688227), the European Union’s H2020 research and innovation program BigDataEurope (GA no. 644564), German Ministry BMWI under the SAKE project (Grant No. 01MD15006E), WDAqua: Marie Skłodowska-Curie Innovative Training Network and Industrial Data Space.


AKSW Colloquium, 07.07.2017, Two paper presentations concerning Link Discovery and Knowledge Base Reasoning

At the AKSW Colloquium on Friday 7th of July, at 10:40 AM there will be two paper presentations concerning genetic algorithms to learn linkage rules, and differentiable learning of logical rules for knowledge base reasoning.

Tommaso Soru will present the paper Differentiable Learning of Logical Rules for Knowledge Base Reasoning, currently a pre-print, by Fan Yang, Zhilin Yang, and William W. Cohen.


“We study the problem of learning probabilistic first-order logical rules for knowledge base reasoning. This learning problem is difficult because it requires learning the parameters in a continuous space as well as the structure in a discrete space. We propose a framework, Neural Logic Programming, that combines the parameter and structure learning of first-order logical rules in an end-to-end differentiable model. This approach is inspired by a recently-developed differentiable logic called TensorLog, where inference tasks can be compiled into sequences of differentiable operations. We design a neural controller system that learns to compose these operations. Empirically, our method obtains state-of-the-art results on multiple knowledge base benchmark datasets, including Freebase and WikiMovies.”

Daniel Obraczka will present the paper Learning Expressive Linkage Rules using Genetic Programming by Isele and Bizer, accepted at VLDB 2012. This work presents an algorithm that learns record linkage rules using genetic programming.


“A central problem in data integration and data cleansing is to find entities in different data sources that describe the same real-world object. Many existing methods for identifying such entities rely on explicit linkage rules which specify the conditions that entities must fulfill in order to be considered to describe the same real-world object. In this paper, we present the GenLink algorithm for learning expressive linkage rules from a set of existing reference links using genetic programming. The algorithm is capable of generating linkage rules which select discriminative properties for comparison, apply chains of data transformations to normalize property values, choose appropriate distance measures and thresholds and combine the results of multiple comparisons using non-linear aggregation functions. Our experiments show that the GenLink algorithm outperforms the state-of-the-art genetic programming approach to learning linkage rules recently presented by Carvalho et. al. and is capable of learning linkage rules which achieve a similar accuracy as human written rules for the same problem.”
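
To make the building blocks named in the abstract concrete, the sketch below assembles one linkage rule by hand: it selects properties, applies a chain of transformations, computes a distance against a threshold, and combines the scores with a non-linear aggregation (here a minimum). This is an illustration only. GenLink learns such rules with genetic programming, which is not shown, and the property names, thresholds and the 0.5 cut-off are assumptions.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def compare(entity_a, entity_b, prop, transforms, distance, threshold):
    """Score one property: 1.0 at distance 0, falling linearly to 0.0
    at or beyond the threshold."""
    value_a, value_b = entity_a[prop], entity_b[prop]
    for transform in transforms:
        value_a, value_b = transform(value_a), transform(value_b)
    return max(0.0, 1.0 - distance(value_a, value_b) / threshold)


def linkage_rule(entity_a, entity_b) -> bool:
    """Hand-written example rule: names must be close AND years must be close."""
    name_score = compare(entity_a, entity_b, "name",
                         [str.lower, str.strip], levenshtein, threshold=3)
    year_score = compare(entity_a, entity_b, "year",
                         [int], lambda x, y: abs(x - y), threshold=2)
    # Minimum aggregation: both comparisons have to score well.
    return min(name_score, year_score) >= 0.5
```

A learner in the spirit of the paper would search over exactly these choices (which properties, which transformations, which distance and threshold, which aggregation) and keep the rules that best reproduce the reference links.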

About the AKSW Colloquium

This event is part of a series of events about Semantic Web technology. Please see http://wiki.aksw.org/public/colloquium for further information about previous and future events. As always, Bachelor and Master students are able to get points for attendance and there is complimentary coffee and cake after the session.


SANSA 0.2 (Semantic Analytics Stack) Released

The AKSW and Smart Data Analytics groups are happy to announce SANSA 0.2 – the second release of the Scalable Semantic Analytics Stack. SANSA employs distributed computing for semantic technologies in order to allow scalable machine learning, inference and querying capabilities for large knowledge graphs.

You can find the FAQ and usage examples at http://sansa-stack.net/faq/.

The following features are currently supported by SANSA:

  • Reading and writing RDF files in N-Triples format
  • Reading OWL files in various standard formats
  • Querying and partitioning based on Sparqlify
  • RDFS/RDFS Simple/OWL-Horst forward chaining inference
  • RDF graph clustering with different algorithms
  • Rule mining from RDF graphs
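
As a rough illustration of the forward chaining inference listed above, the following sketch applies two RDFS entailment rules (rdfs9 and rdfs11) to an in-memory set of triples until nothing new can be derived. It is plain single-machine Python and does not use or imitate the SANSA API; the function name `rdfs_closure` and the prefixed names in the usage example are made up for this example.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def rdfs_closure(triples):
    """Apply rdfs11 (subclass transitivity) and rdfs9 (type inheritance)
    repeatedly until a fixpoint is reached."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        subclass = [(s, o) for s, p, o in inferred if p == SUBCLASS_OF]
        new = set()
        for sub, sup in subclass:
            for s, p, o in inferred:
                # rdfs11: (A subClassOf B), (B subClassOf C) => (A subClassOf C)
                if p == SUBCLASS_OF and s == sup:
                    new.add((sub, SUBCLASS_OF, o))
                # rdfs9: (x type A), (A subClassOf B) => (x type B)
                if p == RDF_TYPE and o == sub:
                    new.add((s, RDF_TYPE, sup))
        if not new <= inferred:
            inferred |= new
            changed = True
    return inferred
```

For example, given `ex:Leipzig rdf:type ex:City`, `ex:City rdfs:subClassOf ex:Place` and `ex:Place rdfs:subClassOf ex:Thing`, the closure also contains `ex:Leipzig rdf:type ex:Thing`. A distributed engine evaluates the same rules as joins over partitioned data instead of nested loops.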

Deployment and getting started:

  • There are template projects for SBT and Maven for Apache Spark as well as for Apache Flink available to get started.
  • The SANSA jar files are in Maven Central, i.e., in most IDEs you can just search for “sansa” to include the dependencies in Maven projects.
  • There is example code for various tasks available.
  • We provide interactive notebooks for running and testing code via Docker.

We want to thank everyone who helped to create this release, in particular the projects Big Data Europe, HOBBIT, SAKE and Big Data Ocean.

SANSA Development Team


AKSW at ESWC 2017

Hello Community! ESWC 2017 has just ended, and here is a short report on the conference, especially regarding the AKSW group.

Our members Dr. Muhammad Saleem, Dr. Mohamed Ahmed Sherif, Claus Stadler, Michael Röder, Prof. Dr. Jens Lehmann and Edgard Marx participated in the conference. They gave a number of presentations, workshops and tutorials:

Michael Röder

Mohamed Ahmed Sherif

Muhammad Saleem

Edgard Marx

  • Presented a workshop paper “Exploring the Evolution and Provenance of Git Versioned RDF Data” by Natanael Arndt, Patrick Naumann and Edgard Marx
  • Presented a demo paper “Kbox – Distributing ready-to-query RDF Knowledge Graphs” by Edgard Marx, Tommaso Soru, Ciro Baron Neto and Sandro Coelho

Claus Stadler

  • Presented a workshop paper in QuWeDa “JPA Criteria Queries over RDF Data” by Claus Stadler and Jens Lehmann

The final versions of the papers from Edgard and Claus will be made available soon.

As every year, ESWC also awarded the best papers and studies in several categories. The award for Best Challenge Paper went to “End-to-end Representation Learning for Question Answering with Weak Supervision” by Daniil Sorokin and Iryna Gurevych. The paper is part of the HOBBIT project by AKSW. Congrats to the winners!

Have a look at all the winners at ESWC 2017: http://2017.eswc-conferences.org/awards.


Four papers accepted at WI 2017

Hello Community! We proudly announce that the International Conference on Web Intelligence (WI) accepted four papers by our group. WI takes place in Leipzig from the 23rd to the 26th of August. The accepted papers are:

“An Evaluation of Models for Runtime Approximation in Link Discovery” by Kleanthi Georgala, Michael Hoffmann, and Axel-Cyrille Ngonga Ngomo.

Abstract: Time-efficient link discovery is of central importance to implement the vision of the Semantic Web. Some of the most rapid Link Discovery approaches rely internally on planning to execute link specifications. In newer works, linear models have been used to estimate the runtime of the fastest planners. However, no other category of models has been studied for this purpose so far. In this paper, we study non-linear runtime estimation functions for runtime estimation. In particular, we study exponential and mixed models for the estimation of the runtimes of planners. To this end, we evaluate three different models for runtime on six datasets using 400 link specifications. We show that exponential and mixed models achieve better fits when trained but are only to be preferred in some cases. Our evaluation also shows that the use of better runtime approximation models has a positive impact on the overall execution of link specifications.
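
The contrast between linear and exponential runtime models can be sketched with a simple least-squares fit. This is only an illustration under strong assumptions: it uses a single hypothetical input feature `x` (for example an input size), whereas the paper's actual features and its mixed model are not reproduced here, and all function names are made up.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def fit_linear_model(xs, ys):
    """Runtime model of the form runtime = a * x + b."""
    a, b = fit_line(xs, ys)
    return lambda x: a * x + b


def fit_exponential_model(xs, ys):
    """Runtime model of the form runtime = exp(a * x + b).

    Fitted as a straight line on log(runtime), so all runtimes must be > 0.
    """
    a, b = fit_line(xs, [math.log(y) for y in ys])
    return lambda x: math.exp(a * x + b)


def rmse(model, xs, ys):
    """Root mean squared error of a fitted model on the given points."""
    return math.sqrt(sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs))
```

Comparing `rmse` for both models on held-out measurements, rather than on the training points, is what distinguishes a better fit from a model that is actually preferable, which is the caveat the abstract raises.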

“CEDAL: Time-Efficient Detection of Erroneous Links in Large-Scale Link Repositories” by Andre Valdestilhas, Tommaso Soru and Axel-Cyrille Ngonga Ngomo.

Abstract: More than 500 million facts on the Linked Data Web are statements across knowledge bases. These links are of crucial importance for the Linked Data Web as they make a large number of tasks possible, including cross-ontology question answering and federated queries. However, a large number of these links are erroneous and can thus lead to these applications producing absurd results. We present a time-efficient and complete approach for the detection of erroneous links for properties that are transitive. To this end, we make use of the semantics of URIs on the Data Web and combine it with an efficient graph partitioning algorithm. We then apply our algorithm to the LinkLion repository and show that we can analyze 19,200,114 links in 4.6 minutes. Our results show that at least 13% of the owl:sameAs links we considered are erroneous. In addition, our analysis of the provenance of links allows discovering agents and knowledge bases that commonly display poor linking. Our algorithm can be easily executed in parallel and on a GPU. We show that these implementations are up to two orders of magnitude faster than classical reasoners and a non-parallel implementation.
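
One plausible reading of this abstract, sketched below, is to partition URIs into equivalence classes under the transitive closure of the links and to flag any class that contains two distinct URIs from the same source. The heuristic that a single dataset should not describe the same thing under two URIs is an assumption made for this sketch, as is using the URI host to identify the dataset; the function names and example URIs are hypothetical and this is not the CEDAL implementation.

```python
from collections import defaultdict
from urllib.parse import urlparse


def find(parent, x):
    """Union-find lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def suspicious_clusters(same_as_links):
    """Group URIs by the transitive closure of the links and flag clusters
    that contain two or more distinct URIs from the same host."""
    parent = {}
    for a, b in same_as_links:
        parent.setdefault(a, a)
        parent.setdefault(b, b)
        root_a, root_b = find(parent, a), find(parent, b)
        if root_a != root_b:
            parent[root_a] = root_b

    clusters = defaultdict(set)
    for uri in parent:
        clusters[find(parent, uri)].add(uri)

    flagged = []
    for members in clusters.values():
        by_host = defaultdict(set)
        for uri in members:
            by_host[urlparse(uri).netloc].add(uri)
        if any(len(uris) > 1 for uris in by_host.values()):
            flagged.append(sorted(members))
    return flagged
```

Because each cluster can be checked independently once the partition is known, this step parallelizes naturally, which is consistent with the abstract's remark about parallel and GPU execution.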

“LOG4MEX: A Library to Export Machine Learning Experiments” by Diego Esteves, Diego Moussallem, Tommaso Soru, Ciro Baron Neto, Jens Lehmann, Axel-Cyrille Ngonga Ngomo and Julio Cesar Duarte.

Abstract: A choice of the best computational solution for a particular task is increasingly reliant on experimentation. Even though experiments are often described through text, tables, and figures, their descriptions are often incomplete or confusing. Thus, researchers often have to perform lengthy web searches for reproducing and understanding the results. In order to minimize this gap, vocabularies and ontologies have been proposed for representing data mining and machine learning (ML) experiments. However, we still lack proper tools to export properly these metadata. To this end, we present an open-source library dubbed LOG4MEX which aims at supporting the scientific community to fulfill this gap.

“GENESIS – A Generic RDF Data Access Interface” by Tim Ermilov, Diego Moussallem, Ricardo Usbeck and Axel-Cyrille Ngonga Ngomo

Abstract: The availability of billions of facts represented in RDF on the Web provides novel opportunities for data discovery and access. In particular, keyword search and question answering approaches enable even lay people to access this data. However, the interpretation of the results of these systems, as well as the navigation through these results, remains challenging. In this paper, we present GENESIS, a generic RDF data access interface. GENESIS can be deployed on top of any knowledge base and search engine with minimal effort and allows for the representation of RDF data in a layperson-friendly way. This is facilitated by the modular architecture for reusable components underlying our framework. Currently, these include a generic search back-end, together with corresponding interactive user interface components based on a service for similar and related entities as well as verbalization services to bridge between RDF and natural language.

The final versions of the papers will be made available soon.

Come over to WI 2017 and enjoy the talks. More information on the program can be found here.


AKSW Colloquium, 29.05.2017, Addressing open Machine Translation problems with Linked Data.

At the AKSW Colloquium on Monday, 29th of May 2017, at 3 PM, Diego Moussallem will present two papers related to his topic. The first paper, “Using BabelNet to Improve OOV Coverage in SMT” by Du et al., was presented at LREC 2016; the second, “How to Configure Statistical Machine Translation with Linked Open Data Resources” by Srivastava et al., was presented at AsLing 2016.


SML-Bench 0.2 Released

Dear all,

we are happy to announce the 0.2 release of SML-Bench, our Structured Machine Learning benchmark framework. SML-Bench provides full benchmarking scenarios for inductive supervised machine learning covering different knowledge representation languages like OWL and Prolog. It already comes with adapters for prominent inductive learning systems like the DL-Learner, the General Inductive Logic Programming System (GILPS), and Aleph, as well as Inductive Logic Programming ‘classics’ like Golem and Progol. The framework is easily extensible, be it in terms of new benchmarking scenarios or support for new learning systems. SML-Bench allows users to define, run and report on benchmarks combining different scenarios and learning systems, giving insight into the performance characteristics of the respective inductive learning algorithms on a wide range of learning problems.

Website: http://sml-bench.aksw.org/
GitHub page: https://github.com/AKSW/SML-Bench/
Change log: https://github.com/AKSW/SML-Bench/releases/tag/0.2

In the current release we extended the options to configure learning systems in the overall benchmarking configuration, and added support for running multiple instances of a learning system, as well as the nesting of instance-specific settings and settings that apply to all instances of a learning system. Besides internal refactoring to increase the overall software quality, we also extended the reporting capabilities of the benchmark results. We added a new benchmark scenario and experimental support for the Statistical Relational Learning system TreeLiker.

We want to thank everyone who helped to create this release and appreciate any feedback.

Best regards,

Patrick Westphal, Simon Bin, Lorenz Bühmann and Jens Lehmann
