LIMES 1.0.0 Released

Dear all,

the LIMES Dev team is happy to announce LIMES 1.0.0.

LIMES, the Link Discovery Framework for Metric Spaces, is a link discovery framework for the Web of Data. It implements time-efficient approaches for large-scale link discovery based on the characteristics of metric spaces. Our approaches use different approximation techniques to compute estimates of the similarity between instances. These estimates are then used to filter out a large number of instance pairs that do not satisfy the mapping conditions. In this way, LIMES can reduce the number of comparisons needed during the mapping process by several orders of magnitude. The approaches implemented in LIMES include the original LIMES algorithm for edit distances, HR3, HYPPO and ORCHID.
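The filtering idea can be illustrated with a minimal sketch. It is not the LIMES implementation: it only assumes a metric (`math.dist`, i.e. Euclidean distance), a single reference point called `exemplar` and a distance threshold `theta`, all of which are hypothetical names chosen for this example. By the triangle inequality, |d(x, e) - d(e, y)| is a lower bound on d(x, y), so any pair whose lower bound already exceeds `theta` can be discarded without computing d(x, y).

```python
import math
from itertools import product


def link_with_filter(source, target, exemplar, theta, dist=math.dist):
    """Return (links, comparisons) for all pairs with dist(x, y) <= theta.

    Distances to the exemplar are computed once per instance. The
    triangle inequality then gives a cheap lower bound on dist(x, y)
    that lets us skip pairs which cannot satisfy the threshold.
    """
    d_src = {x: dist(x, exemplar) for x in source}
    d_tgt = {y: dist(exemplar, y) for y in target}

    links, comparisons = [], 0
    for x, y in product(source, target):
        lower_bound = abs(d_src[x] - d_tgt[y])
        if lower_bound > theta:
            continue  # cannot be within theta, skip the real comparison
        comparisons += 1
        if dist(x, y) <= theta:
            links.append((x, y))
    return links, comparisons


# Illustrative data only: instances are 2D points.
source = [(0, 0), (5, 5), (10, 0)]
target = [(0, 1), (5, 6), (20, 20)]
links, comparisons = link_with_filter(source, target, exemplar=(0, 0), theta=1.5)
print(links, comparisons)
```

In this toy setting only two of the nine candidate pairs need a full distance computation; the remaining seven are ruled out by the lower bound alone.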

Additionally, LIMES supports HELIOS, the first planning technique for link discovery, which minimizes the overall execution time of a link specification without any loss of completeness. Moreover, LIMES implements supervised and unsupervised machine-learning algorithms for finding accurate link specifications. The algorithms implemented here include the supervised, active and unsupervised versions of EAGLE and WOMBAT.

 

Website: http://aksw.org/Projects/LIMES.html

Download: https://github.com/AKSW/LIMES-dev/releases/tag/1.0.0

GitHub: https://github.com/AKSW/LIMES-dev

User manual: http://aksw.github.io/LIMES-dev/user_manual/

Developer manual: http://aksw.github.io/LIMES-dev/developer_manual/

 

What is new in LIMES 1.0.0:

  • New LIMES GUI
  • New Controller that supports manual and graphical configuration
  • New machine learning pipeline: supports supervised, unsupervised and active learning algorithms
  • New dynamic planning for efficient link discovery
  • Updated execution engine to handle dynamic planning
  • Added support for qualitative (Precision, Recall, F-measure etc.) and quantitative (runtime duration etc.) evaluation metrics for mapping evaluation, in the presence of a gold standard
  • Added support for configuration files in XML and RDF formats
  • Added support for point-set metrics such as Mean, Hausdorff and Surjection
  • Added support for the MongeElkan and RatcliffObershelp string measures
  • Added support for Allen’s algebra temporal relations for event data
  • Added support for all topological relations derived from the DE-9IM model
  • Migrated the codebase to Java 8 and Jena 3.0.1
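As an illustration of the qualitative metrics named in the list above, the following sketch computes precision, recall and F-measure (F1) of a mapping against a gold standard. It does not use the LIMES API; the function name `evaluate_mapping` and the example links are assumptions made for this example, and a mapping is simply modelled as a set of (source, target) pairs.

```python
def evaluate_mapping(predicted, gold):
    """Compute precision, recall and F1 of predicted links against a gold standard."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical links between two datasets.
gold = {("a1", "b1"), ("a2", "b2"), ("a3", "b3"), ("a4", "b4")}
predicted = {("a1", "b1"), ("a2", "b2"), ("a5", "b9")}
precision, recall, f1 = evaluate_mapping(predicted, gold)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Two of the three predicted links are correct and two of the four gold links are found, so precision is 2/3, recall is 1/2 and F1 is 4/7.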

We would like to thank everyone who helped to create this release. We also acknowledge the support of the SAKE and HOBBIT projects.

Kind regards,

The LIMES Dev team

 

Posted in Uncategorized

DL-Learner 1.3 (Supervised Structured Machine Learning Framework) Released

Dear all,

the Smart Data Analytics group at AKSW is happy to announce DL-Learner 1.3.

DL-Learner is a framework containing algorithms for supervised machine learning in RDF and OWL. DL-Learner can use various RDF and OWL serialization formats as well as SPARQL endpoints as input, can connect to most popular OWL reasoners and is easily and flexibly configurable. It extends concepts of Inductive Logic Programming and Relational Learning to the Semantic Web in order to allow powerful data analysis.

Website: http://dl-learner.org
GitHub page: https://github.com/AKSW/DL-Learner
Download: https://github.com/AKSW/DL-Learner/releases
ChangeLog: http://dl-learner.org/development/changelog/

DL-Learner is used for data analysis tasks within other tools such as ORE and RDFUnit. Technically, it uses refinement operator based, pattern-based and evolutionary techniques for learning on structured data. For a practical example, see http://dl-learner.org/community/carcinogenesis/. It also offers a plugin for Protégé, which can give suggestions for axioms to add.

In the current release, we added a large number of new algorithms and features. For instance, DL-Learner now supports terminological decision tree learning and integrates the LEAP and EDGE systems as well as the BUNDLE probabilistic OWL reasoner. We migrated the system to Java 8, Jena 3, OWL API 4.2 and Spring 4.3. We want to point to some related efforts here:

We want to thank everyone who helped to create this release. In particular, we want to thank Giuseppe Cota, who visited the core developer team and significantly improved DL-Learner. We also acknowledge support by the recent SAKE project, in which DL-Learner will be applied to event analysis in manufacturing use cases, as well as by the Big Data Europe and HOBBIT projects.

Kind regards,

Lorenz Bühmann, Jens Lehmann, Patrick Westphal and Simon Bin

 

Posted in Uncategorized

OntoWiki 1.0.0 released

Dear Semantic Web and Linked Data Community,
we are proud to finally announce the releases of OntoWiki 1.0.0 and the underlying Erfurt Framework in version 1.8.0.
After 10 years of development, we’ve decided to release the teenager OntoWiki from the cozy home of 0.x versions.
Since the last release, 0.9.11 in January 2014, we have done a lot of testing to stabilize OntoWiki’s behavior and have fixed many bugs accordingly. We are also now using PHP Composer for dependency management, have improved the testing workflow, have given the documentation a new structure and home, and have created a neat project landing page.

The development of OntoWiki is completely open source, and we welcome any contribution, especially to the code and the documentation, which is also kept in a Git repository with easy-to-edit Markdown pages. If you have questions about using OntoWiki that the documentation does not answer, you can also use our mailing list or the Stack Overflow tag “ontowiki”.

Please see https://ontowiki.net/ for further information.

We also presented a poster advertising the OntoWiki release at the SEMANTiCS conference:

OntoWiki 1.0

Philipp Frischmuth, Natanael Arndt, Michael Martin: OntoWiki 1.0: 10 Years of Development – What’s New in OntoWiki

We look forward to your feedback. On behalf of the OntoWiki team,
Philipp, Michael and Natanael

Our Fingers on the Mouse

Posted in Announcements, LEDS, major tool release, OntoWiki, Software Releases

AKSW Colloquium, 05.09.2016. LOD Cloud Statistics, OpenAccess at Leipzig University.

On the upcoming Monday (05.09.2016), the AKSW group will discuss topics related to the Semantic Web and LOD Cloud statistics. We will also have an invited speaker from Leipzig University Library (UBL), Dr. Astrid Vieler, talking about Open Access at Leipzig University.

LODStats: The Data Web Census Dataset

by Ivan Ermilov et al.
Presented by: Ivan Ermilov

Abstract: Over the past years, the size of the Data Web has increased significantly, which makes obtaining general insights into its growth and structure both more challenging and more desirable. The lack of such insights hinders important data management tasks such as quality, privacy and coverage analysis. In this paper, we present the LODStats dataset, which provides a comprehensive picture of the current state of a significant part of the Data Web. LODStats is based on RDF datasets from the data.gov, publicdata.eu and datahub.io data catalogs and at the time of writing lists over 9 000 RDF datasets. For each RDF dataset, LODStats collects comprehensive statistics and makes these available in a form adhering to the LDSO vocabulary. This analysis has been regularly published and enhanced over the past five years at the public platform lodstats.aksw.org. We give a comprehensive overview of the resulting dataset.

OpenAccess at Leipzig University

Invited talk by Dr. Astrid Vieler from Leipzig University Library (UBL). The talk will be about Open Access in general and the Open Access Policy of our university in particular. She will tell us more about the rights we have with respect to publishers, and she will give us advice and hints on how we can increase the visibility of our publications.

After the talks, there is more time for discussion in smaller groups as well as coffee and cake. The colloquium starts at 3 p.m. and takes place on the 7th floor (Leipzig, Augustusplatz 10, Paulinum).

Posted in Announcements, Colloquium, Events, invited talk, paper presentation

AKSW Colloquium, 15th August, 3pm, RDF query relaxation

On the 15th of August at 3 PM, Michael Röder will present the paper “RDF Query Relaxation Strategies Based on Failure Causes” by Fokou et al. in P702.

Abstract

Recent advances in Web-information extraction have led to the creation of several large Knowledge Bases (KBs). Querying these KBs often results in empty answers that do not serve the users’ needs. Relaxation of the failing queries is one of the cooperative techniques used to retrieve alternative results. Most of the previous work on RDF query relaxation computes a set of relaxed queries and executes them in a similarity-based ranking order. Thus, these approaches relax an RDF query without knowing its failure causes (FCs). In this paper, we study the idea of identifying these FCs to speed up the query relaxation process. We propose three relaxation strategies based on various information levels about the FCs of the user query and of its relaxed queries as well. A set of experiments conducted on the LUBM benchmark shows the impact of our proposal in comparison with a state-of-the-art algorithm.

The paper is available at ResearchGate.

About the AKSW Colloquium

This event is part of a series of events about Semantic Web technology. Please see http://wiki.aksw.org/Colloquium for further information about previous and future events. As always, Bachelor and Master students are able to get points for attendance and there is complimentary coffee and cake after the session.

Posted in Uncategorized

Article accepted in Journal of Web Semantics

We are happy to announce that the article “DL-Learner – A Framework for Inductive Learning on the Semantic Web” by Lorenz Bühmann, Jens Lehmann and Patrick Westphal was accepted for publication in the Journal of Web Semantics: Science, Services and Agents on the World Wide Web.

Abstract:

In this system paper, we describe the DL-Learner framework, which supports supervised machine learning using OWL and RDF for background knowledge representation. It can be beneficial in various data and schema analysis tasks with applications in different standard machine learning scenarios, e.g. in the life sciences, as well as Semantic Web specific applications such as ontology learning and enrichment. Since its creation in 2007, it has become the main OWL and RDF-based software framework for supervised structured machine learning and includes several algorithm implementations, usage examples and has applications building on top of the framework. The article gives an overview of the framework with a focus on algorithms and use cases.

Posted in DL-Learner, Papers, Uncategorized

AKSW Colloquium, 18.07.2016, AEGLE and node2vec

On Monday, 18.07.2016, Kleanthi Georgala will present her paper “An Efficient Approach for the Generation of Allen Relations”, which was accepted at the European Conference on Artificial Intelligence (ECAI) 2016.

Abstract

Event data is increasingly being represented according to the Linked Data principles. The need for large-scale machine learning on data represented in this format has thus led to the need for efficient approaches to compute RDF links between resources based on their temporal properties. Time-efficient approaches for computing links between RDF resources have been developed over the last years. However, dedicated approaches for linking resources based on temporal relations have received little attention. In this paper, we address this research gap by presenting AEGLE, a novel approach for the efficient computation of links between events according to Allen’s interval algebra. We study Allen’s relations and show that we can reduce all thirteen relations to eight simpler relations. We then present an efficient algorithm with a complexity of O(n log n) for computing these eight relations. Our evaluation of the runtime of our algorithms shows that we outperform the state of the art by up to 4 orders of magnitude while maintaining a precision and a recall of 1.
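For readers unfamiliar with Allen’s interval algebra, the thirteen relations mentioned in the abstract can be sketched as a direct comparison of the endpoints of two intervals. This is not the AEGLE algorithm: it classifies a single pair of intervals naively, whereas the paper describes a reduction to eight simpler relations and an O(n log n) procedure. The function name `allen_relation` and the relation labels are choices made for this example.

```python
def allen_relation(a, b):
    """Name the Allen relation that holds from interval a to interval b.

    Intervals are (start, end) pairs with start < end.
    """
    (a_start, a_end), (b_start, b_end) = a, b
    if a_start >= a_end or b_start >= b_end:
        raise ValueError("intervals must satisfy start < end")

    if a_end < b_start:
        return "before"
    if b_end < a_start:
        return "after"
    if a_end == b_start:
        return "meets"
    if b_end == a_start:
        return "met-by"
    if a_start == b_start and a_end == b_end:
        return "equals"
    if a_start == b_start:
        return "starts" if a_end < b_end else "started-by"
    if a_end == b_end:
        return "finishes" if a_start > b_start else "finished-by"
    if b_start < a_start and a_end < b_end:
        return "during"
    if a_start < b_start and b_end < a_end:
        return "contains"
    # Only proper overlaps remain at this point.
    return "overlaps" if a_start < b_start else "overlapped-by"


# Illustrative events with numeric timestamps.
print(allen_relation((1, 3), (3, 5)))
print(allen_relation((2, 3), (1, 5)))
```

Applying such a function to every pair of events needs a quadratic number of comparisons, which is the cost that dedicated approaches such as the one in the paper aim to avoid.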

Afterwards, Tommaso Soru will present a paper considered the latest chapter of the Everything-2-Vec saga, which encompasses outstanding works such as Word2Vec and Doc2Vec. The paper is “node2vec: Scalable Feature Learning for Networks” [PDF] by Aditya Grover and Jure Leskovec, accepted for publication at the International Conference on Knowledge Discovery and Data Mining (KDD), 2016 edition.

Posted in Uncategorized

AKSW Colloquium, 04.07.2016. Big Data, Code Quality.

On the upcoming Monday (04.07.2016), the AKSW group will discuss topics related to the Semantic Web and Big Data as well as programming languages and code quality. In particular, the following papers will be presented:

S2RDF: RDF Querying with SPARQL on Spark

by Alexander Schätzle et al.
Presented by: Ivan Ermilov

RDF has become very popular for semantic data publishing due to its flexible and universal graph-like data model. Yet, the ever-increasing size of RDF data collections makes it more and more infeasible to store and process them on a single machine, raising the need for distributed approaches. Instead of building a standalone but closed distributed RDF store, we endorse the usage of existing infrastructures for Big Data processing, e.g. Hadoop. However, SPARQL query performance is a major challenge as these platforms are not designed for RDF processing from the ground up. Thus, existing Hadoop-based approaches often favor certain query pattern shapes while performance drops significantly for other shapes. In this paper, we describe a novel relational partitioning schema for RDF data called ExtVP that uses a semi-join based preprocessing, akin to the concept of Join Indices in relational databases, to efficiently minimize query input size regardless of its pattern shape and diameter. Our prototype system S2RDF is built on top of Spark and uses its relational interface to execute SPARQL queries over ExtVP. We demonstrate its superior performance in comparison to state-of-the-art SPARQL-on-Hadoop approaches using the recent WatDiv test suite. S2RDF achieves sub-second runtimes for the majority of queries on a billion-triple RDF graph.
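The semi-join based preprocessing described in the abstract can be illustrated with a small in-memory sketch. It assumes plain Python sets instead of Spark tables, and the helper names `vertical_partitioning` and `semi_join_os` are hypothetical; the actual ExtVP schema in S2RDF may differ in detail. Triples are first split into one (subject, object) table per predicate, and a table is then reduced to the rows that can actually join with another predicate's table.

```python
from collections import defaultdict


def vertical_partitioning(triples):
    """Group (s, p, o) triples into one set of (s, o) rows per predicate."""
    tables = defaultdict(set)
    for s, p, o in triples:
        tables[p].add((s, o))
    return dict(tables)


def semi_join_os(vp_left, vp_right):
    """Keep rows of vp_left whose object occurs as a subject in vp_right."""
    right_subjects = {s for s, _ in vp_right}
    return {(s, o) for s, o in vp_left if o in right_subjects}


# Illustrative data only.
triples = [
    ("alice", "follows", "bob"),
    ("bob", "follows", "carol"),
    ("carol", "follows", "dave"),
    ("bob", "likes", "post1"),
    ("dave", "likes", "post2"),
]
vp = vertical_partitioning(triples)

# Precomputed reduction for the join pattern  ?x follows ?y . ?y likes ?z
follows_os_likes = semi_join_os(vp["follows"], vp["likes"])
print(sorted(follows_os_likes))
```

A query containing that join pattern can read the reduced table instead of the full `follows` table, which is how precomputation of this kind shrinks the query input.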

A Large Scale Study of Programming Languages and Code Quality in Github

by Baishakhi Ray et al.
Presented by: Tim Ermilov

What is the effect of programming languages on software quality? This question has been a topic of much debate for a very long time. In this study, we gather a very large data set from GitHub (729 projects, 80 Million SLOC, 29,000 authors, 1.5 million commits, in 17 languages) in an attempt to shed some empirical light on this question. This reasonably large sample size allows us to use a mixed-methods approach, combining multiple regression modeling with visualization and text analytics, to study the effect of language features such as static vs. dynamic typing, strong vs. weak typing on software quality. By triangulating findings from different methods, and controlling for confounding effects such as team size, project size, and project history, we report that language design does have a significant, but modest effect on software quality. Most notably, it does appear that strong typing is modestly better than weak typing, and among functional languages, static typing is also somewhat better than dynamic typing. We also find that functional languages are somewhat better than procedural languages. It is worth noting that these modest effects arising from language design are overwhelmingly dominated by the process factors such as project size, team size, and commit size. However, we hasten to caution the reader that even these modest effects might quite possibly be due to other, intangible process factors, e.g., the preference of certain personality types for functional, static and strongly typed languages.

Each paper will be presented in 20 minutes, followed by 10 minutes of discussion. After the talks, there is more time for discussion in smaller groups as well as coffee and cake. The colloquium starts at 3 p.m. and takes place on the 7th floor (Leipzig, Augustusplatz 10, Paulinum).

Posted in BigDataEurope, Colloquium, paper presentation

Accepted Papers of AKSW Members @ Semantics 2016

This year’s SEMANTiCS conference, which takes place from September 12 to 15, 2016 in Leipzig, recently invited submissions of research papers on semantic technologies. Several AKSW members seized the opportunity and had their submitted papers accepted for presentation at the conference.

These are listed below:

  • Executing SPARQL queries over Mapped Document Stores with SparqlMap-M (Jörg Unbehauen, Michael Martin)
  • Distributed Collaboration on RDF Datasets Using Git: Towards the Quit Store (Natanael Arndt, Norman Radtke and Michael Martin)
  • Towards Versioning of Arbitrary RDF Data (Marvin Frommhold, Ruben Navarro Piris, Natanael Arndt, Sebastian Tramp, Niklas Petersen and Michael Martin)
  • DBtrends: Exploring query logs for ranking RDF data (Edgard Marx, Amrapali Zaveri, Diego Moussallem and Sandro Rautenberg)
  • MEX Framework: Automating Machine Learning Metadata Generation (Diego Esteves, Pablo N. Mendes, Diego Moussallem, Julio Cesar Duarte, Maria Claudia Cavalcanti, Jens Lehmann, Ciro Baron Neto and Igor Costa)

Another AKSW-driven event at SEMANTiCS 2016 will be the Linked Enterprise Data Services (LEDS) Track, taking place on September 13-14, 2016. This track is organized by the BMBF-funded LEDS project, which is part of the Entrepreneurial Regions program, a BMBF Innovation Initiative for the New German Länder. The focus is on discussing with academic and industrial partners new approaches to discover and integrate background knowledge into business and governmental environments.

SEMANTiCS 2016 will also host the 7th edition of the DBpedia Community Meeting on the last day of the conference (September 15, ‘DBpedia Day’). DBpedia is a crowd-sourced community effort to extract structured information from Wikipedia and make this information available on the Web. DBpedia allows you to ask sophisticated queries against Wikipedia, and link the different data sets on the Web to Wikipedia data.

So come and join SEMANTiCS 2016, talk and discuss with us!

More information on the program can be found here.


Posted in Announcements, Call for Paper, dbpedia, Events, LEDS, Papers, Uncategorized

AKSW Colloquium, 27.06.2016, When owl:sameAs isn’t the Same + Towards Versioning for Arbitrary RDF Data

In the next Colloquium, on June 27th at 3 PM, two papers will be presented:

When owl:sameAs isn’t the Same: An Analysis of Identity in Linked Data

André Valdestilhas will present the paper “When owl:sameAs isn’t the Same: An Analysis of Identity in Linked Data” by Halpin et al. [PDF]:

Abstract: In Linked Data, the use of owl:sameAs is ubiquitous in interlinking data-sets. There is, however, ongoing discussion about its use, and potential misuse, particularly with regards to interactions with inference. In fact, owl:sameAs can be viewed as encoding only one point on a scale of similarity, one that is often too strong for many of its current uses. We describe how referentially opaque contexts that do not allow inference exist, and then outline some varieties of referentially-opaque alternatives to owl:sameAs. Finally, we report on an empirical experiment over randomly selected owl:sameAs statements from the Web of data. This theoretical apparatus and experiment shed light upon how owl:sameAs is being used (and misused) on the Web of data.

Towards Versioning for Arbitrary RDF Data

Afterwards, Marvin Frommhold will practice the presentation of his paper “Towards Versioning for Arbitrary RDF Data” (Marvin Frommhold, Rubén Navarro Piris, Natanael Arndt, Sebastian Tramp, Niklas Petersen, and Michael Martin) [PDF], which has been accepted at the main conference of SEMANTiCS 2016 in Leipzig.

Abstract: Coherent and consistent tracking of provenance data and in particular update history information is a crucial building block for any serious information system architecture. Version Control Systems can be a part of such an architecture enabling users to query and manipulate versioning information as well as content revisions. In this paper, we introduce an RDF versioning approach as a foundation for a full featured RDF Version Control System. We argue that such a system needs support for all concepts of the RDF specification including support for RDF datasets and blank nodes. Furthermore, we placed special emphasis on the protection against unperceived history manipulation by hashing the resulting patches. In addition to the conceptual analysis and an RDF vocabulary for representing versioning information, we present a mature implementation which captures versioning information for changes to arbitrary RDF datasets.
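The idea of protecting a version history by hashing patches can be sketched with a simple hash chain. This is an illustration under stated assumptions, not the approach or vocabulary from the paper: triples are represented as N-Triples-like strings, blank nodes are ignored (they would need a canonical labelling before hashing), and the names `patch_hash`, `append_patch` and `verify` are hypothetical.

```python
import hashlib
import json


def patch_hash(parent_hash, added, removed):
    """Hash a patch together with the hash of its parent patch."""
    payload = json.dumps(
        {"parent": parent_hash, "added": sorted(added), "removed": sorted(removed)},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_patch(history, added=(), removed=()):
    """Append a patch whose hash depends on the previous patch."""
    parent = history[-1]["hash"] if history else ""
    added, removed = sorted(added), sorted(removed)
    history.append({
        "parent": parent,
        "added": added,
        "removed": removed,
        "hash": patch_hash(parent, added, removed),
    })


def verify(history):
    """Return True if no patch has been altered or reordered."""
    parent = ""
    for patch in history:
        expected = patch_hash(parent, patch["added"], patch["removed"])
        if patch["parent"] != parent or patch["hash"] != expected:
            return False
        parent = patch["hash"]
    return True


history = []
append_patch(history, added=["<ex:a> <ex:knows> <ex:b> ."])
append_patch(history, added=["<ex:b> <ex:knows> <ex:c> ."],
             removed=["<ex:a> <ex:knows> <ex:b> ."])
ok_before = verify(history)

# Silently rewriting an earlier patch breaks the chain.
history[0]["added"] = ["<ex:a> <ex:knows> <ex:z> ."]
ok_after = verify(history)
print(ok_before, ok_after)
```

Because each hash covers the parent hash, changing any earlier patch invalidates every later one, which makes unnoticed history manipulation detectable.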

About the AKSW Colloquium

This event is part of a series of events about Semantic Web technology. Please see http://wiki.aksw.org/Colloquium for further information about previous and future events. As always, Bachelor and Master students are able to get points for attendance and there is complimentary coffee and cake after the session.

Posted in Colloquium, LEDS, LUCID, paper presentation, Papers