Should I publish my dataset under an open license?

Undecided? Stand back, we know flowcharts:

Did you ever try to apply the halting problem to a malformed flowchart?

 

Taken from the slides for my keynote at TKE:

Posted in Announcements, best practices in the Web of Data

TKE 2016 has announced their invited speakers

The 12th International Conference on Terminology and Knowledge Engineering (TKE 2016) has announced its invited speakers, including Dr. Sebastian Hellmann, Head of the AKSW/KILT research group at Leipzig University and Executive Director of the DBpedia Association at the Institute for Applied Informatics (InfAI) e.V. Sebastian Hellmann will give a talk on Challenges, Approaches and Future Work for Linguistic Linked Open Data (LLOD).

The theme of the 12th International Conference on Terminology and Knowledge Engineering will be ‘Term Bases and Linguistic Linked Open Data’. The main aims of TKE 2016 are therefore to bring together researchers from these related fields, provide an overview of the state of the art, discuss problems and opportunities, and exchange information. TKE 2016 will also cover applications, ongoing and planned activities, industrial uses and needs, as well as requirements coming from the new e-society.

The TKE 2016 conference will take place in Copenhagen, Denmark, from 22 to 24 June 2016. Further information about the program and the speakers confirmed so far can be found at the conference website.

 

Posted in Announcements, Events, invited talk

Two Papers accepted at ECAI 2016

Hello Community! We are very pleased to announce that two of our papers were accepted for presentation at the biennial European Conference on Artificial Intelligence (ECAI). ECAI is Europe’s premier venue for presenting scientific results in AI and will be held from August 29th to September 2nd in The Hague, Netherlands.

 

In more detail, we will present the following papers:

An Efficient Approach for the Generation of Allen Relations (Kleanthi Georgala, Mohamed Sherif, Axel-Cyrille Ngonga Ngomo)

Abstract: Event data is increasingly being represented according to the Linked Data principles. The need for large-scale machine learning on data represented in this format has thus led to the need for efficient approaches to compute RDF links between resources based on their temporal properties. Time-efficient approaches for computing links between RDF resources have been developed over the last years. However, dedicated approaches for linking resources based on temporal relations have received little attention. In this paper, we address this research gap by presenting AEGLE, a novel approach for the efficient computation of links between events according to Allen’s interval algebra. We study Allen’s relations and show that we can reduce all thirteen relations to eight simpler relations. We then present an efficient algorithm with a complexity of O(n log n) for computing these eight relations. Our evaluation of the runtime of our algorithms shows that we outperform the state of the art by up to 4 orders of magnitude while maintaining a precision and a recall of 100%.
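For readers unfamiliar with Allen’s interval algebra, the thirteen relations mentioned in the abstract can be illustrated with a naive pairwise check. This is only a sketch and not the AEGLE algorithm: it compares one pair of intervals at a time, and the function name `allen_relation` as well as the `(start, end)` tuple encoding are assumptions made for this example.

```python
def allen_relation(a, b):
    """Return the name of the Allen relation holding between intervals a and b.

    Each interval is a (start, end) tuple with start < end (an assumption of
    this sketch; the paper works on RDF event resources instead).
    """
    (s1, e1), (s2, e2) = a, b
    if s1 >= e1 or s2 >= e2:
        raise ValueError("intervals must satisfy start < end")
    # Disjoint or touching intervals
    if e1 < s2:
        return "before"
    if e2 < s1:
        return "after"
    if e1 == s2:
        return "meets"
    if e2 == s1:
        return "met-by"
    # Intervals sharing one or both endpoints
    if s1 == s2 and e1 == e2:
        return "equals"
    if s1 == s2:
        return "starts" if e1 < e2 else "started-by"
    if e1 == e2:
        return "finishes" if s1 > s2 else "finished-by"
    # Strict containment
    if s2 < s1 and e1 < e2:
        return "during"
    if s1 < s2 and e2 < e1:
        return "contains"
    # Only a proper partial overlap remains
    return "overlaps" if s1 < s2 else "overlapped-by"


print(allen_relation((1, 5), (3, 8)))
```

Applying such a check to every pair of events is quadratic in the number of events, which is the cost the O(n log n) approach described in the abstract is meant to avoid.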

Towards SPARQL-Based Induction for Large-Scale RDF Data sets (Simon Bin, Lorenz Bühmann, Jens Lehmann, Axel-Cyrille Ngonga Ngomo)

Abstract: We show how to convert OWL Class Expressions to SPARQL queries where the instances of that concept are (with restrictions sensible in the considered concept induction scenario) equal to the SPARQL query result. Furthermore, we implement and integrate our converter into the CELOE algorithm (Class Expression Learning for Ontology Engineering). Therein, it replaces the position of a traditional OWL reasoner, which most structured machine learning approaches assume knowledge to be loaded into. This will foster the application of structured machine learning to the Semantic Web, since most data is readily available in triple stores. We provide experimental evidence for the usefulness of the bridge. In particular, we show that we can improve the runtime of machine learning approaches by several orders of magnitude. With these results, we show that machine learning algorithms can now be executed on data on which in-memory reasoners could not previously be used.
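To give a rough idea of what converting a class expression into a SPARQL query can look like, here is a toy sketch. It is not the converter from the paper: the nested-tuple encoding, the function name `to_sparql` and the example.org IRIs are all hypothetical, only three constructors (named class, intersection, existential restriction) are handled, and no inference such as subclass reasoning is taken into account.

```python
from itertools import count


def to_sparql(expr):
    """Translate a toy class expression into a SPARQL SELECT query string.

    Supported forms (hypothetical encoding used only in this sketch):
      ("class", iri)           named class
      ("and", e1, e2, ...)     intersection
      ("some", prop, filler)   existential restriction
    """
    fresh = count(1)  # supplies numbers for intermediate variables

    def pattern(e, var):
        kind = e[0]
        if kind == "class":
            # Named class: the variable is typed with the class IRI
            return f"{var} a <{e[1]}> ."
        if kind == "and":
            # Intersection: all sub-patterns constrain the same variable
            return " ".join(pattern(sub, var) for sub in e[1:])
        if kind == "some":
            # Existential restriction: follow the property to a fresh variable
            _, prop, filler = e
            obj = f"?v{next(fresh)}"
            return f"{var} <{prop}> {obj} . {pattern(filler, obj)}"
        raise ValueError(f"unsupported constructor: {kind}")

    return f"SELECT DISTINCT ?x WHERE {{ {pattern(expr, '?x')} }}"


# "Person and (hasChild some Student)" in the toy encoding
expr = (
    "and",
    ("class", "http://example.org/Person"),
    ("some", "http://example.org/hasChild", ("class", "http://example.org/Student")),
)
print(to_sparql(expr))
```

A query generated this way could be sent to a triple store to retrieve candidate instances, which is the role an in-memory reasoner would otherwise play in the learning loop.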

Come over to ECAI and enjoy the talks. For more information on the conference program and other papers please see here.

Sandra on behalf of AKSW

Posted in Announcements, Call for Paper

AKSW Colloquium, 13.06.2016, SPARQL query processing with Apache Spark

In the upcoming Colloquium, Simon Bin will discuss the paper “SPARQL query processing with Apache Spark” by H. Naacke et al., which has been submitted to ISWC 2016.

Abstract

The number of linked data sources and the size of the linked open data graph keep growing every day. As a consequence, semantic RDF services are more and more confronted with various big data problems. Query processing is one of them and needs to be efficiently addressed with executions over scalable, highly available and fault-tolerant frameworks. Data management systems requiring these properties are rarely built from scratch but are rather designed on top of an existing cluster computing engine. In this work, we consider the processing of SPARQL queries with Apache Spark.
We propose and compare five different query processing approaches based on different join execution models and Spark components. A detailed experimentation, on real-world and synthetic data sets, emphasizes that two approaches tailored for the RDF data model outperform the other ones on all major query shapes, i.e., star, snowflake, chain and hybrid.

About the AKSW Colloquium

This event is part of a series of events about Semantic Web technology. Please see http://wiki.aksw.org/Colloquium for further information about previous and future events. As always, Bachelor and Master students are able to get points for attendance and there is complimentary coffee and cake after the session.

Posted in Colloquium

AKSW at ESWC 2016


We are very pleased to report that four of our papers were accepted for presentation as full papers at ESWC 2016.

In addition, we organised the first HOBBIT community meeting. Many thanks to all who participated. Get involved in the project by going here. Our survey pertaining to benchmarking is still open and we’d love to have your feedback on what you would want benchmarking Linked Data to look like.

We also presented three research projects, namely HOBBIT, QAMEL and DIESEL, during the EU networking sessions. Many thanks for the fruitful discussions and ideas.

Finally, we thank all the systems which participated in QALD-6 and OKE and made these challenges so interesting. Little perk: We have yet to find a system that beats CETUS at the OKE challenge :)

FYI, a full list of accepted conference papers can be found here.

Workshops

In addition to the main conference, we were active during the workshops. Axel gave the keynote at the Profiles workshop (many thanks to the organizers for the invite). The following papers were accepted as full papers.

  • DBtrends: Publishing and Benchmarking RDF Ranking Functions by Edgard Marx, Amrapali J. Zaveri, Mofeed Mohammed, Sandro Rautenberg, Jens Lehmann, Axel-Cyrille Ngonga Ngomo and Gong Cheng, SumPre2016 Workshop at ESWC 2016
  • Towards Sustainable view-based Extract-Transform-Load (ETL) Fusion of Open Data by Kay Mueller, Claus Stadler, Ritesh Kumar Singh and Sebastian Hellmann, LDQ2016 [pdf]
  • UPSP: Unique Predicate-based Source Selection for SPARQL Endpoint Federation by Ethem Cem Ozkan, Muhammad Saleem, Erdogan Dogdu and Axel-Cyrille Ngonga Ngomo, PROFILES Workshop at ESWC 2016 [pdf]
  • Federated Query Processing: Challenges and Opportunities by Axel-Cyrille Ngonga Ngomo and Muhammad Saleem, keynote at PROFILES Workshop at ESWC 2016 [pdf]

Quo Vadis?

We are now looking forward to EDF 2016, where we will present HOBBIT as a poster as well as organise a post-conference event (see http://project-hobbit.eu/edf2016). Thereafter, you can meet us at ISWC 2016, where we will present two tutorials (Link Discovery and Federated SPARQL queries) and organise the BLINK workshop (http://project-hobbit.eu/events/blink-2016/). Your submissions are welcome.

 

Posted in Uncategorized

AKSW@LREC2016

Since the first edition held in Granada in 1998, LREC has become one of the major events on Language Resources (LRs) and Language Technologies (LT). At the 10th edition of the Language Resources and Evaluation Conference (LREC 2016), held from 23 to 28 May 2016 in Portorož (Slovenia), the AKSW/KILT members Bettina Klimek, Milan Dojchinovski and Sebastian Hellmann actively participated. At the conference they presented their most recent research results and project outcomes in the areas of Linked Data and Language Technologies. With over 1250 paper submissions and 744 accepted papers, we are pleased to have contributed to the research field with the following contributions:

  • DBpedia Abstracts: A Large-Scale, Open, Multilingual NLP Training Corpus, by Brümmer, Martin; Dojchinovski, Milan and Hellmann, Sebastian [PDF]
  • FREME: Multilingual Semantic Enrichment with Linked Data and Language Technologies, by Dojchinovski, Milan; Sasaki, Felix; Gornostaja, Tatjana; Hellmann, Sebastian; Mannens, Erik; Salliau, Frank; Osella, Michele; Ritchie, Phil; Stoitsis, Giannis; Koidl, Kevin; Ackermann, Markus and Chakraborty, Nilesh [PDF]
  • Creating Linked Data Morphological Language Resources with MMoOn – The Hebrew Morpheme Inventory, by Klimek, Bettina; Arndt, Natanael; Krause, Sebastian and Arndt, Timotheus [PDF]
  • The Open Linguistics Working Group: Developing the Linguistic Linked Open Data Cloud, by McCrae, John P.; Chiarcos, Christian; Bond, Francis; Cimiano, Philipp; Declerck, Thierry; de Melo, Gerard; Gracia, Jorge; Hellmann, Sebastian; Klimek, Bettina; Moran, Steven; Osenova, Petya; Pareja-Lora, Antonio and Pool, Jonathan [PDF]

At the main conference, Bettina Klimek gave an oral presentation of the Hebrew Morpheme Inventory, which is based on the MMoOn project. The audience showed great interest in the data and the underlying MMoOn ontology and asked about possible applications, such as creating MMoOn-based lemmatizers.

Bettina Klimek presenting @LREC2016

Milan Dojchinovski @LREC 2016

 

Further, Milan Dojchinovski gave two poster presentations summarizing the latest results from the FREME project. He presented “DBpedia Abstracts”, a large-scale, open, multilingual NLP training corpus. The presentation attracted great interest from the audience, particularly in how the corpus can be used. Several requests for the corpus in other languages (e.g., Welsh) were also received.

Milan also presented the latest developments within the FREME project and the framework itself. The presentation focused primarily on the technical aspects of the framework, its availability, its active use in real-world scenarios, and future plans.

Also, as active members of the Open Knowledge Foundation’s Working Group on Open Data in Linguistics (OWLG), Sebastian Hellmann and Bettina Klimek helped organize the 5th Workshop on Linked Data in Linguistics (LDL-2016), which was one of the LREC conference workshops. Around 50 participants attended the workshop and discussed topics related to managing, building and using linked language resources. In the workshop’s poster session, Bettina Klimek introduced the MMoOn model for representing morphological language data to the many interested workshop attendees. In addition, Milan Dojchinovski presented results from the FREME project which relate to the research presented at the LDL workshop and to the Linked Data and Language Technologies community.

The LDL Workshop participants.

In continuation of the OWLG-organized events, the First Workshop on Knowledge Extraction and Knowledge Integration (KEKI 2016) will take place on 17-18 October in conjunction with the 15th International Semantic Web Conference in Kobe (Japan). The topics of linguistic Linked Data creation and integration will be taken up in order to move the LLOD cloud to its next phase, in which innovative applications will be developed that overcome the language barriers on the Web. Paper submission is open until 1 July!

During the main conference days, 25-27 May 2016, Milan Dojchinovski and Felix Sasaki (FREME project coordinator) took part in the exhibition area with a booth dedicated to the FREME project. The goal was to meet people interested in understanding how the open framework deployed within the project may help narrow the gap between actual business needs and language and Linked Data technologies. For more on the FREME presence at LREC 2016 you can read here.

LREC has been a great event to meet the community, make new connections, discuss current research challenges, share ideas, and establish new collaborations. We look forward to the next LREC conference in two years!

Posted in Events

AKSW Publishes Survey on Challenges of Question Answering in the Semantic Web

We are happy to announce that our Survey on Challenges of Question Answering in the Semantic Web (Konrad Höffner, Sebastian Walter, Edgard Marx, Ricardo Usbeck, Jens Lehmann and Axel Ngonga) has been accepted by the Semantic Web Journal.

Abstract

Semantic Question Answering (SQA) removes two major access requirements to the Semantic Web: the mastery of a formal query language like SPARQL and knowledge of a specific vocabulary. Because of the complexity of natural language, SQA presents difficult challenges and many research opportunities. Instead of a shared effort, however, many essential components are redeveloped, which is an inefficient use of researchers’ time and resources. This survey analyzes 62 different SQA systems, which are systematically and manually selected using predefined inclusion and exclusion criteria, leading to 72 selected publications out of 1960 candidates. We identify common challenges, structure solutions, and provide recommendations for future systems. This work is based on publications from the end of 2010 to July 2015 and is also compared to older but similar surveys.

Posted in Announcements, Papers

AKSW Colloquium, 30.05.2016, PARIS: Probabilistic Alignment of Relations, Instances, and Schema


In the upcoming colloquium, Mohamed Ahmed Sherif will present the paper “PARIS: Probabilistic Alignment of Relations, Instances, and Schema” by Suchanek et al., published in the proceedings of VLDB 2012 [PDF].

Abstract

One of the main challenges that the Semantic Web faces is the integration of a growing number of independently designed ontologies. In this work, we present PARIS, an approach for the automatic alignment of ontologies. PARIS aligns not only instances, but also relations and classes. Alignments at the instance level cross-fertilize with alignments at the schema level. Thereby, our system provides a truly holistic solution to the problem of ontology alignment. The heart of the approach is probabilistic, i.e., we measure degrees of matchings based on probability estimates. This allows PARIS to run without any parameter tuning. We demonstrate the efficiency of the algorithm and its precision through extensive experiments. In particular, we obtain a precision of around 90% in experiments with some of the world’s largest ontologies.

This event is part of a series of events about Semantic Web technology. Please see http://wiki.aksw.org/Colloquium for further information about previous and future events. As always, Bachelor and Master students are able to get points for attendance and there is complimentary coffee and cake after the session.

Posted in Uncategorized

AKSW Colloquium, 23.05.2016, Instance Matching and RDF Dataset Similarity

In the upcoming colloquium, Mofeed Hassan will present the paper “Semi-supervised Instance Matching Using Boosted Classifiers” by Kejriwal et al., published in the proceedings of ESWC 2015 [PDF].

Abstract

Instance matching concerns identifying pairs of instances that refer to the same underlying entity. Current state-of-the-art instance matchers use machine learning methods. Supervised learning systems achieve good performance by training on significant amounts of manually labeled samples. To alleviate the labeling effort, this paper presents a minimally supervised instance matching approach that is able to deliver competitive performance using only 2% training data and little parameter tuning. As a first step, the classifier is trained in an ensemble setting using boosting. Iterative semi-supervised learning is used to improve the performance of the boosted classifier even further, by re-training it on the most confident samples labeled in the current iteration. Empirical evaluations on a suite of six publicly available benchmarks show that the proposed system outcompetes optimization-based minimally supervised approaches in 1-7 iterations. The system’s average F-Measure is shown to be within 2.5% of that of recent supervised systems that require more training samples for effective performance.

After that, Michael Röder will present his paper “Detecting Similar Linked Datasets Using Topic Modelling”, which has been accepted at the upcoming ESWC 2016 [PDF].

Abstract

The Web of data is growing continuously with respect to both the size and number of the datasets published. Porting a dataset to five-star Linked Data however requires the publisher of this dataset to link it with the already available linked datasets. Given the size and growth of the Linked Data Cloud, the current mostly manual approach used for detecting relevant datasets for linking is obsolete. We study the use of topic modelling for dataset search experimentally and present TAPIOCA, a linked dataset search engine that provides data publishers with similar existing datasets automatically. Our search engine uses a novel approach for determining the topical similarity of datasets. This approach relies on probabilistic topic modelling to determine related datasets by relying solely on the metadata of datasets. We evaluate our approach on a manually created gold standard and with a user study. Our evaluation shows that our algorithm outperforms a set of comparable baseline algorithms including standard search engines significantly by 6% F1-score. Moreover, we show that it can be used on a large real world dataset with a comparable performance.

About the AKSW Colloquium

This event is part of a series of events about Semantic Web technology. Please see http://wiki.aksw.org/Colloquium for further information about previous and future events. As always, Bachelor and Master students are able to get points for attendance and there is complimentary coffee and cake after the session.

Posted in Uncategorized

AKSW Colloquium, 09.05.2016: Hebrew MMoOn inventory, federated SPARQL query processing

In this week’s colloquium, Bettina Klimek will give a practice talk on the paper “Creating Linked Data Morphological Language Resources with MMoOn – The Hebrew Morpheme Inventory”, which she will present at the LREC 2016 conference, 23-28 May 2016, in Portorož, Slovenia.

Abstract

The development of standard models for describing general lexical resources has led to the emergence of numerous lexical datasets of various languages in the Semantic Web. However, there are no models that describe the domain of morphology in a similar manner. As a result, there are hardly any language resources of morphemic data available in RDF to date. This paper presents the creation of the Hebrew Morpheme Inventory from a manually compiled tabular dataset comprising around 52,000 entries. It is an ongoing effort of representing the lexemes, word-forms and morphological patterns together with their underlying relations based on the newly created Multilingual Morpheme Ontology (MMoOn). It will be shown how segmented Hebrew language data can be granularly described in a Linked Data format, thus serving as an exemplary case for creating morpheme inventories of any inflectional language with MMoOn. The resulting dataset is described a) according to the structure of the underlying data format, b) with respect to the Hebrew language characteristic of building word-forms directly from roots, c) by exemplifying how inflectional information is realized and d) with regard to its enrichment with external links to sense resources.

As a second talk, Muhammad Saleem will present his thesis titled “Efficient Source Selection For SPARQL Endpoint Federation”. This thesis addresses two key areas of federated SPARQL query processing: (1) efficient source selection, and (2) comprehensive SPARQL benchmarks to test and rank federated SPARQL engines as well as triple stores.

About the AKSW Colloquium

This event is part of a series of events about Semantic Web technology. Please see http://wiki.aksw.org/Colloquium for further information about previous and future events. As always, Bachelor and Master students are able to get points for attendance and there is complimentary coffee and cake after the session. The colloquium will take place in room P701.

Posted in Colloquium, paper presentation, PHD thesis defense practise