More Complete Resultset Retrieval from Large Heterogeneous RDF Sources

Over recent years, the Web of Data has grown significantly. Various interfaces such as LOD Stats, LOD Laundromat and SPARQL endpoints provide access to hundreds of thousands of RDF datasets, representing billions of facts. These datasets are available in different formats such as raw data dumps and HDT files, or directly accessible via SPARQL endpoints. Querying such a large amount of distributed data is particularly challenging and many of these datasets cannot be directly queried using the SPARQL query language.

To tackle these problems, we present WimuQ, an integrated query engine that executes SPARQL queries and retrieves results from a large number of heterogeneous RDF data sources. Presently, WimuQ is able to execute both federated and non-federated SPARQL queries over a total of 668,166 datasets from LOD Stats and LOD Laundromat, as well as 559 active SPARQL endpoints. These data sources represent a total of 221.7 billion triples from more than 5 terabytes of information, drawn from datasets retrieved using the service “Where is My URI” (WIMU). Our evaluation on state-of-the-art real-data benchmarks shows that WimuQ retrieves more complete results for the benchmark queries.

The contributions of this work are:

  • A hybrid SPARQL query-processing engine to execute SPARQL queries over a large amount of heterogeneous RDF data.
  • An evaluation on real-world datasets using state-of-the-art federated and non-federated query benchmarks (FedBench, LargeRDFBench and FEASIBLE).
  • The first federated SPARQL query-processing engine that executes SPARQL queries over a total of 221.7 billion triples.
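
The idea of routing one query to differently accessed sources can be illustrated with a minimal Python sketch. It is not WimuQ's actual implementation: `URI_INDEX`, the dataset names and the access kinds are hypothetical stand-ins for a URI-to-dataset lookup service such as WIMU. The sketch extracts the IRIs written as `<...>` in a SPARQL query, looks up which datasets mention them, and groups the candidates by access kind. Prefixed names and other SPARQL syntax are deliberately ignored.

```python
import re
from collections import defaultdict

# Hypothetical stand-in for a URI-to-dataset lookup service such as WIMU.
# Each IRI maps to (dataset name, access kind) pairs; all entries are made up.
URI_INDEX = {
    "http://dbpedia.org/resource/Leipzig": [
        ("dbpedia-endpoint", "sparql"),
        ("dump-0001", "hdt"),
    ],
    "http://dbpedia.org/ontology/populationTotal": [
        ("dbpedia-endpoint", "sparql"),
    ],
}

# Simplification: anything between angle brackets without whitespace is an IRI.
IRI_PATTERN = re.compile(r"<([^<>\s]+)>")


def extract_iris(query):
    """Return the distinct IRIs written as <...> in the query, in order."""
    seen = []
    for iri in IRI_PATTERN.findall(query):
        if iri not in seen:
            seen.append(iri)
    return seen


def plan_sources(query, index=URI_INDEX):
    """Group candidate datasets by access kind for the IRIs in the query."""
    plan = defaultdict(set)
    for iri in extract_iris(query):
        for dataset, kind in index.get(iri, []):
            plan[kind].add(dataset)
    return {kind: sorted(names) for kind, names in plan.items()}


QUERY = """
SELECT ?population WHERE {
  <http://dbpedia.org/resource/Leipzig>
      <http://dbpedia.org/ontology/populationTotal> ?population .
}
"""

if __name__ == "__main__":
    print(plan_sources(QUERY))
```

A real engine would then run the query against each group with the matching access method (endpoint request, HDT lookup, or a downloaded dump) and merge the results.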

This is ongoing work; the next step is a large-scale approach to studying the relations and similarities among the datasets. This work was supported by the Semantic Web group of HTWK Leipzig (https://www.htwk-leipzig.de/) under the supervision of Prof. Dr. rer. nat. Thomas Riechert.

GitHub repository: https://github.com/firmao/wimuT

Prototype/proof of concept: https://w3id.org/wimuq/

Slides: https://tinyurl.com/slidesKcap2019

Paper: https://dl.acm.org/citation.cfm?id=3364436

Conference: http://www.k-cap.org/2019/

Authors/Contact: valdestilhas@informatik.uni-leipzig.de, tsoru@informatik.uni-leipzig.de, saleem@informatik.uni-leipzig.de

Posted in paper presentation, Papers

DL-Learner 1.4 (Supervised Structured Machine Learning Framework) Released

Dear all,

The Smart Data Analytics group [1] and the E.T.-db-MOLE sub-group located at the InfAI Leipzig [2] are happy to announce

DL-Learner 1.4.

DL-Learner is a framework containing algorithms for supervised machine learning in RDF and OWL. DL-Learner can use various RDF and OWL serialization formats as well as SPARQL endpoints as input, can connect to most popular OWL reasoners and is easily and flexibly configurable. It extends concepts of Inductive Logic Programming and Relational Learning to the Semantic Web in order to allow powerful data analysis.
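
To give a feel for what supervised learning of class expressions means, here is a toy Python sketch. It is not DL-Learner's API or one of its algorithms, which are considerably more sophisticated and work with OWL reasoners. The knowledge base `KB`, its individuals and the brute-force search are assumptions made purely for illustration. Given positive and negative example individuals, the sketch returns the smallest conjunctions of named classes that cover every positive example and no negative one.

```python
from itertools import combinations

# Hypothetical toy knowledge base: individual -> set of named classes.
KB = {
    "anna": {"Person", "Parent", "Female"},
    "bob": {"Person", "Parent", "Male"},
    "carl": {"Person", "Male"},
    "dora": {"Person", "Female"},
}


def covers(concept, individual, kb=KB):
    """A concept is a frozenset of class names, read as their intersection."""
    return concept <= kb[individual]


def learn(positives, negatives, kb=KB, max_size=2):
    """Return the smallest conjunctions covering all positives, no negatives."""
    classes = sorted(set().union(*kb.values()))
    for size in range(1, max_size + 1):
        found = [
            sorted(concept)
            for concept in map(frozenset, combinations(classes, size))
            if all(covers(concept, p, kb) for p in positives)
            and not any(covers(concept, n, kb) for n in negatives)
        ]
        if found:
            return found
    return []


if __name__ == "__main__":
    # Which description separates the parents from the others?
    print(learn(["anna", "bob"], ["carl", "dora"]))
```

With `anna` and `bob` as positive examples and `carl` and `dora` as negative ones, the only single class that separates them in this toy data is `Parent`.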

Website: http://dl-learner.org
GitHub page: https://github.com/SmartDataAnalytics/DL-Learner
Download: https://github.com/SmartDataAnalytics/DL-Learner/releases/tag/1.4.0

In the current release, we continued to improve the code and work on our query tree and class expression learning algorithms. The config file can now optionally be written in JSON syntax. We updated the packaging to be ready for Java 11 and also tested DL-Learner on Windows. Some logical fixes to the Horizontal Expansion in CELOE were reported and analysed by Yingbing Hua, thanks!

The DL-Learner system has also been presented at The Web Conference in Lyon 2018 [3]. We want to thank everyone who helped to create this release. We also acknowledge support by the following projects: LIMBO [4], QROWD [5], SAKE [6], Big Data Europe [7], HOBBIT [8], GeoKnow [9], GOLD [10], and SLIPO [11].

Kind regards,

Jens Lehmann, Lorenz Bühmann, Patrick Westphal and Simon Bin

[1] http://sda.tech
[2] https://infai.org/efficient-technology-integration/
[3] http://jens-lehmann.org/files/2018/www_dllearner.pdf
[4] https://www.limbo-project.org/
[5] http://qrowd-project.eu/
[6] https://www.sake-projekt.de/
[7] https://www.big-data-europe.eu/
[8] http://project-hobbit.eu/
[9] http://geoknow.eu/
[10] http://aksw.org/Projects/GOLD.html
[11] http://www.slipo.eu/

Posted in Announcements, DL-Learner, Software Releases, Uncategorized

DBpedia Day @ SEMANTiCS 2019

We are happy to announce that SEMANTiCS 2019 will host the 14th DBpedia Community Meeting on the last day of the conference, September 12, 2019.

Highlights/Sessions

  • Keynote #1: Katja Hose, Aalborg University, Denmark
  • Keynote #2: Dan Weitzner from WPSemantix
  • DBpedia Databus presentation and training session
  • DBpedia Association hour
  • DBpedia Showcase session
  • DBpedia Chapter session

Call for Contribution

Tell us what cool things you do with DBpedia:  Present your tools and datasets at the DBpedia Community Meeting! Please submit your presentations, posters, demos or other forms of contributions through our web form.

Quick Facts

  • Web URL: https://wiki.dbpedia.org/events/14th-dbpedia-community-meeting-karlsruhe
  • When: September 12th, 2019
  • Where: Leibniz-Institut für Informationsinfrastruktur – FIZ Karlsruhe, Hermann-von-Helmholtz-Platz 1, 76344 Eggenstein-Leopoldshafen, Germany
  • Call for Contribution: Submit your proposal in our form
  • Registration: Attending the DBpedia Community Meeting costs 90 €. You can buy your ticket on the SEMANTiCS website. DBpedia members get free admission. Please contact your nearest DBpedia chapter or the DBpedia Association for a promotion code.

Sponsors and Acknowledgments

In case you want to sponsor the 14th DBpedia Community Meeting, please contact the DBpedia Association via dbpedia@infai.org.

Organisation

  • Tina Schmeissner, DBpedia Association
  • Sandra Prätor, AKSW/KILT, DBpedia Association
  • Sebastian Hellmann, AKSW/KILT, DBpedia Association

We are looking forward to meeting you in Karlsruhe!

Your DBpedia Association

Posted in Call for Paper, Call for Students, dbpedia, Events

LDK conference @ University of Leipzig

With the advent of digital technologies, an ever-increasing amount of language data is now available across various application areas and industry sectors, thus making language data more and more valuable. In that context, we are happy to invite you to join the 2nd Language, Data and Knowledge (LDK) conference which will be held in Leipzig from May 20th till 22nd, 2019.

This new biennial conference series aims at bringing together researchers from across disciplines concerned with language data in data science and knowledge-based applications.

The acquisition, provenance, representation, maintenance, usability and quality of language data, as well as its legal, organizational and infrastructure aspects, are at the centre of research on language data and thus constitute the focus of the conference.

To register and be part of the LDK conference and its associated events, please go to http://2019.ldk-conf.org/registration/.

Keynote Speakers

  • Keynote #1: Christian Bizer, Mannheim University
  • Keynote #2: Christiane Fellbaum, Princeton University
  • Keynote #3: Eduard Werner, Leipzig University

Associated Events

The following events are co-located with LDK 2019:

Workshops on the 20th May 2019

DBpedia Community Meeting on the 23rd May 2019

Looking forward to meeting you at the conference!

Posted in Announcements, dbpedia, Events

13th DBpedia community meeting in Leipzig

We are happy to invite you to join the 13th edition of the DBpedia Community Meeting, which will be held in Leipzig. Following the LDK conference, May 20-22, the DBpedia Community will get together on May 23rd, 2019 at Mediencampus Villa Ida. Once again the meeting will be accompanied by a varied program of exciting lectures and showcases.

Highlights/Sessions

  • Keynote #1: Making Linked Data Fun with DBpedia by Peter Haase, metaphacts
  • Keynote #2: From Wikipedia to Thousands of Wikis – The DBkWik Knowledge Graph by Heiko Paulheim, Universität Mannheim
  • NLP and DBpedia Session
  • DBpedia Association Hour
  • DBpedia Showcase Session

Call for Contribution

What cool things do you do with DBpedia? Present your tools and datasets at the DBpedia Community Meeting! Please submit your presentations, posters, demos or other forms of contributions through our web form.

Tickets

Attending the DBpedia Community meeting costs 40 €. You need to buy a ticket via eshop.sachsen.de. DBpedia members get free admission. Please contact your nearest DBpedia chapter for a promotion code, or please contact the DBpedia Association.

If you would like to attend the LDK conference, please register here.

We are looking forward to meeting you in Leipzig!

Posted in dbpedia, Events

SANSA 0.5 (Semantic Analytics Stack) Released

We are happy to announce SANSA 0.5 – the fifth release of the Scalable Semantic Analytics Stack. SANSA employs distributed computing via Apache Spark and Flink in order to allow scalable machine learning, inference and querying capabilities for large knowledge graphs.

You can find the FAQ and usage examples at http://sansa-stack.net/faq/.

The following features are currently supported by SANSA:

  • Reading and writing RDF files in N-Triples, Turtle, RDF/XML and N-Quads formats
  • Reading OWL files in various standard formats
  • Query heterogeneous sources (Data Lake) using SPARQL – CSV, Parquet, MongoDB, Cassandra, JDBC (MySQL, SQL Server, etc.) are supported
  • Support for multiple data partitioning techniques
  • SPARQL querying via Sparqlify and Ontop
  • Graph-parallel querying of RDF using SPARQL (1.0) via GraphX traversals (experimental)
  • RDFS, RDFS Simple and OWL-Horst forward chaining inference
  • RDF graph clustering with different algorithms
  • Terminological decision trees (experimental)
  • Knowledge graph embedding approaches: TransE (beta), DistMult (beta)
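
As an illustration of what forward chaining inference does, the following single-machine Python sketch applies two standard RDFS entailment rules, rdfs11 (transitivity of `rdfs:subClassOf`) and rdfs9 (instances of a class are instances of its superclasses), until no new triples appear. It is not SANSA's API and does not use Spark or Flink; the tuple-based triple representation and the `ex:` names are assumptions chosen for brevity.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def forward_chain(triples):
    """Apply rdfs9 and rdfs11 until no new triples are derived."""
    closure = set(triples)
    while True:
        derived = set()
        sub = [(s, o) for s, p, o in closure if p == SUBCLASS_OF]
        for c, d in sub:
            # rdfs11: subClassOf is transitive.
            for d2, e in sub:
                if d == d2:
                    derived.add((c, SUBCLASS_OF, e))
            # rdfs9: instances of a class are instances of its superclasses.
            for s, p, o in closure:
                if p == RDF_TYPE and o == c:
                    derived.add((s, RDF_TYPE, d))
        if derived <= closure:
            return closure
        closure |= derived


EXAMPLE = {
    ("ex:Dog", SUBCLASS_OF, "ex:Mammal"),
    ("ex:Mammal", SUBCLASS_OF, "ex:Animal"),
    ("ex:rex", RDF_TYPE, "ex:Dog"),
}

if __name__ == "__main__":
    for triple in sorted(forward_chain(EXAMPLE)):
        print(triple)
```

A distributed engine evaluates the same kind of rules as joins over partitioned data rather than as nested loops in memory.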

Noteworthy changes or updates since the previous release are:

  • A data lake concept for querying heterogeneous data sources has been integrated into SANSA
  • New clustering algorithms have been added and the interface for clustering has been unified
  • Ontop RDB2RDF engine support has been added
  • RDF data quality assessment methods have been substantially improved
  • Dataset statistics calculation has been substantially improved
  • Improved unit test coverage

Deployment and getting started:

  • Template projects for SBT and Maven are available for both Apache Spark and Apache Flink to help you get started.
  • The SANSA jar files are in Maven Central, i.e., in most IDEs you can just search for “sansa” to include the dependencies in Maven projects.
  • Example code is available for various tasks.
  • We provide interactive notebooks for running and testing code via Docker.

We want to thank everyone who helped to create this release, in particular the projects HOBBIT, Big Data Ocean, SLIPO, QROWD, BETTER, BOOST, MLwin and Simple-ML.

Spread the word by retweeting our release announcement on Twitter. For more updates, please view our Twitter feed and consider following us.

Greetings from the SANSA Development Team

 

Posted in Uncategorized

AKSW at web.br in São Paulo

From October 1st until 6th, a delegation from the AKSW Group, Leipzig University of Applied Sciences (HTWK), eccenca GmbH, and the Max Planck Institute for Human Cognitive and Brain Sciences went to São Paulo, Brazil, to meet people from the Web Technologies Study Center (ceweb.br) and evaluate future collaboration.
To get to know each other's research interests, we held the Workshop on Linked Data Management.


The Workshop on Linked Data Management (Workshop sobre Gestão de Dados Abertos) was co-located with the annual conference of the Brazilian World Wide Web Consortium (Conferencia web.br 2018) in São Paulo.
During the workshop, 11 talks were given by researchers from the Brazilian hosts and the German delegation.
By presenting our research areas, open questions, and visions to each other and to the audience, we were able to identify overlapping research interests and complementary areas of expertise.
A recurring hypothesis was that Open Data is a very powerful method to foster participation, accessibility, and collaboration across areas.
The presentations made visible the potential in three areas: research data in the digital humanities, the accessibility of educational resources and the organization of educational infrastructures, and participation in public administration and government.
A recurring topic in the presentations was the need for collaboration among actors and stakeholders, which gives rise to the need for methodologies and systems that support this collaboration.
An asset for potential future cooperation in this research area between the Brazilian and the German side is the complementary interests and experience of the groups.
The Brazilian side has an existing involvement with public administration, government, and education, particularly with regard to the special needs seen from a developing-country perspective.
The German side has a strong background in the creation and operation of data management systems and infrastructures, as well as in data integration.
We are currently establishing communication channels and collaboration platforms that allow efficient joint work across time zones, languages, and continents, to foster the cooperation between the two groups.
To reach a common understanding of our interests and skills, the first subject of collaboration is a common, extended documentation of the initial workshop. Following this documentation, a requirements engineering process will be started to identify the concrete needs and potentials on both sides for a common project in the future.
The second workshop, planned for June 2019, will focus on the results of this discussion.

After the workshop we also visited the DFG Office in Latin America to discuss possible research collaboration between German institutions and institutions in São Paulo.

The Open Data Management Workshop and the visit of the German delegation are funded by the German Research Foundation (DFG) in cooperation with the São Paulo Research Foundation (FAPESP) under grant agreement number 388784229.

Also read about our trip at the HTWK news portal (German).

Posted in Projects, workshop

AskNow 0.1 Released

Dear all,

We are very happy to announce AskNow 0.1 – the initial release of Question Answering Components and Tools over RDF Knowledge Graphs.

Website: http://asknow.sda.tech/
Demo: http://asknowdemo.sda.tech
GitHub: https://github.com/AskNowQA

The following components with corresponding features are currently supported by AskNow:

  • AskNow UI 0.1: The UI is the platform through which users pose their questions to the AskNow QA system. It displays answers according to their type: a single entity, a list of entities, a boolean, or a literal. For entities it shows the abstracts from DBpedia.
    GitHub: https://github.com/AskNowQA/AskNowUI
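
The distinction between answer types can be sketched in Python as follows. This is not AskNow's code: the function `classify_answer` and the sample results are hypothetical, and the sketch only assumes that answers arrive in the standard SPARQL 1.1 Query Results JSON Format, where ASK queries carry a `boolean` member and SELECT queries carry `results.bindings` whose terms have a `type` such as `uri` or `literal`. As a simplification, any result that is not made up purely of URIs is treated as a literal answer.

```python
import json


def classify_answer(result):
    """Classify a parsed SPARQL JSON result by the kind of answer it holds."""
    if "boolean" in result:
        return "boolean"
    bindings = result.get("results", {}).get("bindings", [])
    terms = [term for row in bindings for term in row.values()]
    if not terms:
        return "empty"
    if all(term["type"] == "uri" for term in terms):
        return "entity" if len(terms) == 1 else "entity list"
    # Simplification: blank nodes and mixed results also end up here.
    return "literal"


# Made-up sample results in the SPARQL 1.1 JSON results layout.
ASK_RESULT = json.loads('{"head": {}, "boolean": true}')
ENTITY_RESULT = json.loads(
    '{"head": {"vars": ["x"]}, "results": {"bindings": ['
    '{"x": {"type": "uri", "value": "http://dbpedia.org/resource/Leipzig"}}]}}'
)
LITERAL_RESULT = json.loads(
    '{"head": {"vars": ["n"]}, "results": {"bindings": ['
    '{"n": {"type": "literal", "value": "42"}}]}}'
)

if __name__ == "__main__":
    for sample in (ASK_RESULT, ENTITY_RESULT, LITERAL_RESULT):
        print(classify_answer(sample))
```

A user interface could use such a classification to decide, for example, whether to fetch and show an abstract for an entity or to print a literal value directly.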

We want to thank everyone who helped to create this release, in particular the projects HOBBIT, SOLIDE, WDAqua, BigDataEurope.

View this announcement on Twitter: https://twitter.com/AskNowQA/status/1040205350853599233

Kind regards,
The AskNow Development Team
(http://asknow.sda.tech/people/)

Posted in Announcements

Jekyll RDF Tutorial Screencast

Since 2016 we have been developing Jekyll-RDF, a plugin for the well-known static website generator Jekyll. With Jekyll-RDF we took the slogan of Jekyll, “Transform your plain text into static websites and blogs”, and transformed it to “Transform your RDF Knowledge Graph into static websites and blogs”. This enables people without deep programming knowledge to publish data that is encoded in complicated RDF structures on the web in an easy-to-browse format.

To ease your start with Jekyll-RDF I’ve created a Tutorial Screencast that teaches you all the basics necessary to create a simple Jekyll page from an RDF knowledgebase. I hope that you enjoy it and that it is helpful for you!

crosspost: https://natanael.arndt.xyz/2018/08/07/jekyll-rdf-tutorial-screencast

Posted in Announcements, LEDS

DBpedia Day @ SEMANTiCS 2018

Don’t miss the 12th edition of the DBpedia Community Meeting in Vienna, the city with the highest quality of life in the world. The DBpedia Community will get together for the DBpedia Day on September 10, the first day of the SEMANTiCS Conference which will be held from September 10 to 13, 2018.


What cool things do you do with DBpedia? Present your tools and datasets at the DBpedia Community Meeting! Please submit your presentations, posters, demos or other forms of contributions through our web form.


Highlights/Sessions

  • Keynote #1: Dealing with Open Domain Data by Javier David Fernández García (WU)
  • Keynote #2: Linked Open Data cloud – act now before it’s too late by Mathieu d’Aquin (NUI Galway)
  • DBpedia Showcase Session
  • DBpedia Association Hour
  • Special Chapter Session with DBpedia language chapters from different parts of Europe

Tickets

  • Attending the DBpedia Community Meeting costs €50 (excl. registration fee and VAT). DBpedia members get free admission. Please contact your nearest DBpedia chapter or the DBpedia Association for a promotion code.
  • Please check all details here!

We are looking forward to meeting you in Vienna!

 

Posted in dbpedia, Events