AKSW Colloquium, 13th February, 3pm, Evaluating Entity Linking

On the 13th of February at 3 PM, Michael Röder will present two papers in P702: “Evaluating Entity Linking: An Analysis of Current Benchmark Datasets and a Roadmap for Doing a Better Job” by van Erp et al. and “Moving away from semantic overfitting in disambiguation datasets” by Postma et al.

Abstract 1

Entity linking has become a popular task in both natural language processing and semantic web communities. However, we find that the benchmark datasets for entity linking tasks do not accurately evaluate entity linking systems. In this paper, we aim to chart the strengths and weaknesses of current benchmark datasets and sketch a roadmap for the community to devise better benchmark datasets.

Abstract 2

Entities and events in the world have no frequency, but our communication about them and the expressions we use to refer to them do have a strong frequency profile. Language expressions and their meanings follow a Zipfian distribution, featuring a small amount of very frequent observations and a very long tail of low frequent observations. Since our NLP datasets sample texts but do not sample the world, they are no exception to Zipf’s law. This causes a lack of representativeness in our NLP tasks, leading to models that can capture the head phenomena in language, but fail when dealing with the long tail. We therefore propose a referential challenge for semantic NLP that reflects a higher degree of ambiguity and variance and captures a large range of small real-world phenomena. To perform well, systems would have to show deep understanding on the linguistic tail.
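To make the head versus long tail contrast concrete, here is a minimal sketch that is not taken from the paper. It assumes an idealised Zipfian distribution in which the probability of the expression at rank r is proportional to 1/r^s; the function names, the exponent and the vocabulary size are illustrative assumptions.

```python
def zipf_probabilities(n_items: int, exponent: float = 1.0) -> list[float]:
    """Return normalised Zipfian probabilities for ranks 1..n_items.

    Assumption: probability of rank r is proportional to 1 / r**exponent.
    """
    weights = [1.0 / (rank ** exponent) for rank in range(1, n_items + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def head_coverage(probs: list[float], head_size: int) -> float:
    """Share of all observations that fall on the `head_size` most frequent items."""
    return sum(probs[:head_size])


# Hypothetical vocabulary of 10,000 expressions, ranked by frequency.
probs = zipf_probabilities(10_000)
share = head_coverage(probs, 100)
print(f"The 100 most frequent expressions cover {share:.1%} of all observations")
```

Under these assumptions, a dataset sampled from text is dominated by a small head of expressions, so a system can score well while rarely being tested on the remaining 9,900 tail expressions.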

The papers are available at lrec-conf.org and aclweb.org.

About the AKSW Colloquium

This event is part of a series of events about Semantic Web technology. Please see http://wiki.aksw.org/Colloquium for further information about previous and future events. As always, Bachelor and Master students are able to get points for attendance and there is complimentary coffee and cake after the session.

Posted in Uncategorized

SLIPO project kick-off meeting

SLIPO, a new InfAI project, kicked off between the 18th and 20th of January in Athens, Greece. Funded by the EU programme “Horizon 2020”, the project is scheduled to run until the 31st of December 2019.

Scalable Linking and Integration of Big POI Data (SLIPO) aims to apply the results of the GeoKnow research to the specific challenges of POI (point of interest) data, which is becoming increasingly indispensable in the fields of tracking, logistics and tourism. Furthermore, we plan to improve the scalability of our key research frameworks, such as LIMES, DEER and LinkedGeoData.

For a closer look, please visit: http://aksw.org/Projects/SLIPO.html

Our partners through this process are: 

This project has received funding from the European Union’s H2020 research and innovation action program under grant agreement number 731581.

Posted in Announcements, Kickoff, SLIPO

AKSW Colloquium 30.Jan.2017

In the upcoming Colloquium, Simon Bin will discuss the paper “Towards Analytics Aware Ontology Based Access to Static and Streaming Data” by Evgeny Kharlamov et al., which was presented at ISWC 2016.

Abstract

Real-time analytics that requires integration and aggregation of heterogeneous and distributed streaming and static data is a typical task in many industrial scenarios such as diagnostics of turbines in Siemens. OBDA approach has a great potential to facilitate such tasks; however, it has a number of limitations in dealing with analytics that restrict its use in important industrial applications. Based on our experience with Siemens, we argue that in order to overcome those limitations OBDA should be extended and become analytics, source, and cost aware. In this work we propose such an extension. In particular, we propose an ontology, mapping, and query language for OBDA, where aggregate and other analytical functions are first class citizens. Moreover, we develop query optimisation techniques that allow to efficiently process analytical tasks over static and streaming data. We implement our approach in a system and evaluate our system with Siemens turbine data.

About the AKSW Colloquium

This event is part of a series of events about Semantic Web technology. Please see http://wiki.aksw.org/public/colloquium for further information about previous and future events. As always, Bachelor and Master students are able to get points for attendance and there is complimentary coffee and cake after the session.

Posted in Colloquium

AKSW Colloquium, 23.01.2017, Automatic Mappings of Tables to Knowledge Graphs and Open Table Extraction


At the upcoming colloquium on 23.01.2017, Ivan Ermilov will present his work on automatic mappings of tables to knowledge graphs, which was published as “TAIPAN: Automatic Property Mapping for Tabular Data” at the EKAW 2016 conference, as well as an extension of this work, including:

  • Open Table Extraction (OTE) approach, i.e. how to generate meaningful information from a big corpus of tables.
  • How to benchmark OTE and which benchmarks are available.
  • OTE use cases and applications.

 

About the AKSW Colloquium

This event is part of a series of events about Semantic Web technology. Please see http://wiki.aksw.org/Colloquium for further information about previous and future events. As always, Bachelor and Master students are able to get points for attendance and there is complimentary coffee and cake after the session.

Posted in Colloquium, PHD progress report, PhD topic

PRESS RELEASE: “HOBBIT so far.” is now available

The latest release reports on the conferences our team attended in 2016 and on the blog posts we published. Furthermore, it gives a short analysis of the survey through which we were able to verify the requirements of our benchmarks and of the new HOBBIT platform. Last but not least, the release gives a short outlook on our plans for 2017, including the founding of the HOBBIT association.

Have a look at the whole press release on the HOBBIT website.

Posted in Announcements, HOBBIT, Press Release, Projects

4th Big Data Europe Plenary at Leipzig University


The meeting, hosted by our partner InfAI e. V., took place from the 14th to the 15th of December at the University of Leipzig.
A total of 29 attendees, including 15 partners, discussed and reviewed the progress of all work packages in 2016 and planned the activities and workshops taking place in the next six months.

On the second day we talked about several societal challenge pilots in the fields of AgroKnow, transport, security etc. It was the last plenary of the year, and we thank everybody for their work in 2016. Big Data Europe and our partners are looking forward to 2017.

The next Plenary Meeting will be hosted by VU Amsterdam and will take place in Amsterdam, in June 2017.

Posted in Announcements, BigDataEurope, Projects

SANSA 0.1 (Semantic Analytics Stack) Released

Dear all,

The Smart Data Analytics group / AKSW is very happy to announce SANSA 0.1, the initial release of the Scalable Semantic Analytics Stack. SANSA combines distributed computing and semantic technologies to enable powerful machine learning, inference and querying capabilities for large knowledge graphs.

Website: http://sansa-stack.net
GitHub: https://github.com/SANSA-Stack
Download: http://sansa-stack.net/downloads-usage/
ChangeLog: https://github.com/SANSA-Stack/SANSA-Stack/releases

You can find the FAQ and usage examples at http://sansa-stack.net/faq/.

The following features are currently supported by SANSA:

  • Support for reading and writing RDF files in N-Triples format
  • Support for reading OWL files in various standard formats
  • Querying and partitioning based on Sparqlify
  • Support for RDFS/RDFS Simple/OWL-Horst forward chaining inference
  • Initial RDF graph clustering support
  • Initial support for rule mining from RDF graphs

We want to thank everyone who helped to create this release, in particular, the projects Big Data Europe, HOBBIT and SAKE.

Kind regards,

The SANSA Development Team

Posted in SANSA

AKSW wins award for Best Resources Paper at ISWC 2016 in Japan

Our paper, “LODStats: The Data Web Census Dataset”, won the award for Best Resources Paper at the recent ISWC 2016 conference in Kobe, Japan, the premier international forum for the Semantic Web and Linked Data community. The paper presents the LODStats dataset, which provides a comprehensive picture of the current state of a significant part of the Data Web.

Congrats to Ivan Ermilov, Jens Lehmann, Michael Martin and Sören Auer.

Please find the complete list of winners here.

 

Posted in Announcements, Papers

AKSW Colloquium, 28.11.2016, NED using PBOH + Large-Scale Learning of Relation-Extraction Rules.

In the upcoming Colloquium, on the 28th of November at 3 PM, two papers will be presented:

Probabilistic Bag-Of-Hyperlinks Model for Entity Linking

Diego Moussallem will discuss the paper “Probabilistic Bag-Of-Hyperlinks Model for Entity Linking” by Octavian-Eugen Ganea et al., which was accepted at WWW 2016.

Abstract: Many fundamental problems in natural language processing rely on determining what entities appear in a given text. Commonly referenced as entity linking, this step is a fundamental component of many NLP tasks such as text understanding, automatic summarization, semantic search or machine translation. Name ambiguity, word polysemy, context dependencies and a heavy-tailed distribution of entities contribute to the complexity of this problem. We here propose a probabilistic approach that makes use of an effective graphical model to perform collective entity disambiguation. Input mentions (i.e., linkable token spans) are disambiguated jointly across an entire document by combining a document-level prior of entity co-occurrences with local information captured from mentions and their surrounding context. The model is based on simple sufficient statistics extracted from data, thus relying on few parameters to be learned. Our method does not require extensive feature engineering, nor an expensive training procedure. We use loopy belief propagation to perform approximate inference. The low complexity of our model makes this step sufficiently fast for real-time usage. We demonstrate the accuracy of our approach on a wide range of benchmark datasets, showing that it matches, and in many cases outperforms, existing state-of-the-art methods.

Large-Scale Learning of Relation-Extraction Rules with Distant Supervision from the Web

Afterward, René Speck will present the paper “Large-Scale Learning of Relation-Extraction Rules with Distant Supervision from the Web” by Sebastian Krause et al., which was accepted at ISWC 2012.

Abstract: We present a large-scale relation extraction (RE) system which learns grammar-based RE rules from the Web by utilizing large numbers of relation instances as seed. Our goal is to obtain rule sets large enough to cover the actual range of linguistic variation, thus tackling the long-tail problem of real-world applications. A variant of distant supervision learns several relations in parallel, enabling a new method of rule filtering. The system detects both binary and n-ary relations. We target 39 relations from Freebase, for which 3M sentences extracted from 20M web pages serve as the basis for learning an average of 40K distinctive rules per relation. Employing an efficient dependency parser, the average run time for each relation is only 19 hours. We compare these rules with ones learned from local corpora of different sizes and demonstrate that the Web is indeed needed for a good coverage of linguistic variation.

About the AKSW Colloquium

This event is part of a series of events about Semantic Web technology. Please see http://wiki.aksw.org/Colloquium for further information about previous and future events. As always, Bachelor and Master students are able to get points for attendance and there is complimentary coffee and cake after the session.

Posted in Uncategorized

Accepted paper in AAAI 2017

Hello Community! We are very pleased to announce that our paper “Radon – Rapid Discovery of Topological Relations” was accepted for presentation at the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17), which will be held on February 4–9 at the Hilton San Francisco, San Francisco, California, USA.

In more detail, we will present the following paper: “Radon – Rapid Discovery of Topological Relations” by Mohamed Ahmed Sherif, Kevin Dreßler, Panayiotis Smeros, and Axel-Cyrille Ngonga Ngomo.

Abstract. Datasets containing geo-spatial resources are increasingly being represented according to the Linked Data principles. Several time-efficient approaches for discovering links between RDF resources have been developed over the last years. However, the time-efficient discovery of topological relations between geospatial resources has been paid little attention to. We address this research gap by presenting Radon, a novel approach for the rapid computation of topological relations between geo-spatial resources. Our approach uses a sparse tiling index in combination with minimum bounding boxes to reduce the computation time of topological relations. Our evaluation of Radon’s runtime on 45 datasets and in more than 800 experiments shows that it outperforms the state of the art by up to 3 orders of magnitude while maintaining an F-measure of 100%. Moreover, our experiments suggest that Radon scales up well when implemented in parallel.
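The general idea of combining a sparse tiling index with minimum bounding boxes can be sketched as follows. This is a simplified illustration under our own assumptions, not Radon's actual algorithm or the LIMES API: the function names, the fixed `tile_size` and the toy geometries are all hypothetical. Each geometry is reduced to its bounding box, the box is assigned to the grid tiles it touches, and only geometries that share a tile and whose boxes intersect are kept as candidates for an exact (and more expensive) topological test.

```python
from collections import defaultdict
from itertools import product
from math import floor

Point = tuple[float, float]
BBox = tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def bounding_box(points: list[Point]) -> BBox:
    """Minimum axis-aligned bounding box of a geometry given as a point list."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def tiles_for(box: BBox, tile_size: float):
    """Yield the (column, row) indices of all grid tiles the box touches."""
    min_x, min_y, max_x, max_y = box
    cols = range(floor(min_x / tile_size), floor(max_x / tile_size) + 1)
    rows = range(floor(min_y / tile_size), floor(max_y / tile_size) + 1)
    return product(cols, rows)


def boxes_intersect(a: BBox, b: BBox) -> bool:
    """True if the two bounding boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def candidate_pairs(source: dict[str, list[Point]],
                    target: dict[str, list[Point]],
                    tile_size: float = 1.0) -> set[tuple[str, str]]:
    """Return (source_id, target_id) pairs that may be topologically related."""
    source_boxes = {sid: bounding_box(pts) for sid, pts in source.items()}

    # Sparse index: only tiles touched by at least one source box are stored.
    index: dict[tuple[int, int], set[str]] = defaultdict(set)
    for sid, box in source_boxes.items():
        for tile in tiles_for(box, tile_size):
            index[tile].add(sid)

    pairs: set[tuple[str, str]] = set()
    for tid, pts in target.items():
        target_box = bounding_box(pts)
        for tile in tiles_for(target_box, tile_size):
            for sid in index.get(tile, ()):
                # Cheap bounding-box filter before any exact geometry test.
                if boxes_intersect(source_boxes[sid], target_box):
                    pairs.add((sid, tid))
    return pairs


# Toy data (hypothetical): only the park and the playground are close together.
source = {
    "park": [(0.2, 0.2), (1.5, 0.2), (1.5, 1.4), (0.2, 1.4)],
    "lake": [(5.0, 5.0), (6.0, 5.0), (6.0, 6.0)],
}
target = {
    "playground": [(1.0, 1.0), (1.2, 1.0), (1.2, 1.2)],
    "harbour": [(9.0, 9.0), (9.5, 9.5), (9.0, 9.5)],
}
print(candidate_pairs(source, target))
```

In this sketch, pairs that never share a tile are not compared at all, which is where the reduction in computation time comes from. An exact relation check (for example with a geometry library) would still have to be run on the surviving candidates.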

Acknowledgments
This work is implemented in the link discovery framework LIMES and has been supported by the European Union’s H2020 research and innovation action HOBBIT (GA no. 688227) as well as the BMWI Project GEISER (project no. 01MD16014E).

Posted in Uncategorized