University of Leipzig AKSW Homepage | Blog |

ALIGNED project kick-off

March 3, 2015 - 12:43 pm by MartinBruemmer

ALIGNED, AKSW’s new H2020-funded project, kicked off in Dublin. The project brings together computer science researchers, companies building data-intensive systems and information technology, and academic curators of large datasets in an effort to build IT systems for aligned, co-evolving software and data lifecycles. These lifecycles will support automated testing, runtime data quality analytics, model-generated extraction and human curation interfaces.

AKSW will lead the data quality engineering part of ALIGNED, controlling the data lifecycle and providing integrity and verification techniques, using state-of-the-art tools such as RDFUnit and upcoming standards like W3C Data Shapes. In this project, we will support our partners at Trinity College Dublin and Oxford Software Engineering as technical partners, Oxford Anthropology and Adam Mickiewicz University Poznan as data curators and publishers, as well as the Semantic Web Company and Wolters Kluwer Germany, which provide enterprise solutions and use cases.

Find out more at aligned-project.eu and by following @AlignedProject on Twitter.

Martin Brümmer on behalf of the NLP2RDF group

Aligned project kick-off team picture

AKSW Colloquium: Tommaso Soru and Martin Brümmer on Monday, March 2 at 3.00 p.m.

February 27, 2015 - 1:57 pm by AmrapaliZaveri

On Monday, 2nd of March 2015, Tommaso Soru will present ROCKER, a refinement operator approach for key discovery. Martin Brümmer will then present NIF annotation and provenance – A comparison of approaches.

Tommaso Soru – ROCKER – Abstract

As in the typical entity-relationship model, unique and composite keys are of central importance when the concept is applied to the Linked Data paradigm. They can help in many areas, such as entity search, question answering, data integration and link discovery. However, the current state of the art offers no approaches that scale while relying on a correct definition of a key. We thus present a refinement-operator-based approach dubbed ROCKER, which has been shown to scale to big datasets with respect to run time and memory consumption. ROCKER will be officially introduced at the 24th International Conference on World Wide Web.

Tommaso Soru, Edgard Marx, and Axel-Cyrille Ngonga Ngomo, “ROCKER – A Refinement Operator for Key Discovery”. [PDF]
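
To make the notion of a key concrete, here is a minimal sketch in Python. It is not the ROCKER algorithm: it is a naive brute-force check over a toy dataset, and the resource names, property names and the `is_key` and `minimal_keys` helpers are all made up for illustration. It assumes that a set of properties is a key if no two resources share the same values on all of them, and that a missing property counts as an empty value set.

```python
from itertools import combinations

# Toy data (hypothetical): each resource maps properties to sets of values.
RESOURCES = {
    "ex:alice": {"ex:name": {"Alice"}, "ex:born": {"1980"}, "ex:city": {"Leipzig"}},
    "ex:bob": {"ex:name": {"Bob"}, "ex:born": {"1980"}, "ex:city": {"Leipzig"}},
    "ex:carol": {"ex:name": {"Alice"}, "ex:born": {"1975"}, "ex:city": {"Dublin"}},
}


def is_key(resources, properties):
    """Return True if no two resources agree on the values of all properties."""
    seen = set()
    for values in resources.values():
        # Assumption: a missing property is treated as an empty value set.
        signature = tuple(frozenset(values.get(p, ())) for p in properties)
        if signature in seen:
            return False
        seen.add(signature)
    return True


def minimal_keys(resources, max_size=2):
    """Brute-force search for minimal keys, trying small property sets first."""
    props = sorted({p for values in resources.values() for p in values})
    found = []
    for size in range(1, max_size + 1):
        for candidate in combinations(props, size):
            # A superset of a known key is a key too, but not a minimal one.
            if any(set(key) <= set(candidate) for key in found):
                continue
            if is_key(resources, candidate):
                found.append(candidate)
    return found


if __name__ == "__main__":
    print(minimal_keys(RESOURCES))
```

In this toy dataset no single property is a key, while the name combined with either the birth year or the city is. The brute-force search grows exponentially with the number of properties, which is the kind of scaling problem the abstract refers to.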

Martin Brümmer – NIF annotation and provenance – A comparison of approaches – Abstract

The increasing use of the NLP Interchange Format (NIF) reveals its shortcomings on a number of levels. One of these is tracking metadata of annotations represented in NIF: which NLP tool added which annotation, with what confidence, at which point in time, etc.

A number of solutions to this task of annotating annotations expressed as RDF statements have been proposed over the years. The talk will weigh these solutions, namely annotation resources, reification, Open Annotation, quads and singleton properties, with regard to their granularity, ease of implementation and query complexity.

The goal of the talk is to present and compare viable alternatives for solving the problem at hand and to collect feedback on how to proceed.
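
As a rough illustration of three of the approaches named above, the following Python sketch models statements as plain tuples and attaches the same metadata to one annotation via reification, a singleton property and a named graph (quads). It is a sketch under stated assumptions, not material from the talk: every `ex:` IRI, the metadata properties and the three helper functions are hypothetical, and `ex:singletonPropertyOf` is only a placeholder for the linking predicate of the singleton property approach. Only the `rdf:` reification terms come from the RDF vocabulary.

```python
# Base annotation: a text span linked to an entity (all IRIs are placeholders).
ANNOTATION = ("ex:doc#char=0,7", "ex:refersTo", "ex:Leipzig")

# Metadata to attach: which tool produced the annotation, with what confidence.
METADATA = {"ex:producedBy": "ex:ToolA", "ex:confidence": "0.87"}


def reify(triple, stmt_id, metadata):
    """RDF reification: describe the statement itself as a resource."""
    s, p, o = triple
    triples = [
        (stmt_id, "rdf:type", "rdf:Statement"),
        (stmt_id, "rdf:subject", s),
        (stmt_id, "rdf:predicate", p),
        (stmt_id, "rdf:object", o),
    ]
    # Note: reification describes the triple but does not assert it.
    return triples + [(stmt_id, k, v) for k, v in metadata.items()]


def singleton_property(triple, prop_id, metadata):
    """Singleton property: a unique predicate instance carries the metadata."""
    s, p, o = triple
    triples = [
        (s, prop_id, o),
        # Placeholder name for the link back to the generic predicate.
        (prop_id, "ex:singletonPropertyOf", p),
    ]
    return triples + [(prop_id, k, v) for k, v in metadata.items()]


def named_graph(triple, graph_id, metadata):
    """Quads: put the triple in its own graph and describe that graph."""
    s, p, o = triple
    quads = [(s, p, o, graph_id)]
    # Assumption: metadata about the graph lives in a separate provenance graph.
    return quads + [(graph_id, k, v, "ex:provenance") for k, v in metadata.items()]


if __name__ == "__main__":
    for name, build in (
        ("reification", reify),
        ("singleton property", singleton_property),
        ("named graph", named_graph),
    ):
        print(name, len(build(ANNOTATION, "ex:m1", METADATA)))
```

On this toy input the three encodings need 6, 4 and 3 statements respectively. Statement count is only one of the criteria the talk lists; query complexity and tool support are not captured by this sketch.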

AKSW Colloquium: Edgard Marx and Tommaso Soru on Monday, February 23, 3.00 p.m.

February 19, 2015 - 10:53 pm by TommasoSoru

On Monday, 23rd of February 2015, Edgard Marx will introduce Smart, a search engine designed over the Semantic Search paradigm; subsequently, Tommaso Soru will present ROCKER, a refinement operator approach for key discovery.

EDIT: Tommaso Soru’s presentation was moved to March 2nd.

Abstract – Smart

Since the conception of the Web, search engines have played a key role in making content available. However, retrieving the desired information is still a significant challenge. Semantic Search systems are a natural evolution of traditional search engines. They promise more accurate interpretation by understanding the contextual meaning of the user query. In this talk, we will introduce our audience to Smart, a search engine designed over the Semantic Search paradigm. Smart incorporates two of our currently designed approaches for dealing with the problem of Information Retrieval, as well as a novel interface paradigm. Moreover, we will present some earlier as well as more recent state-of-the-art approaches used by the industry, for instance by Yahoo!, Google and Facebook.

Abstract – ROCKER

As in the typical entity-relationship model, unique and composite keys are of central importance when the concept is applied to the Linked Data paradigm. They can help in many areas, such as entity search, question answering, data integration and link discovery. However, the current state of the art offers no approaches that scale while relying on a correct definition of a key. We thus present a refinement-operator-based approach dubbed ROCKER, which has been shown to scale to big datasets with respect to run time and memory consumption. ROCKER will be officially introduced at the 24th International Conference on World Wide Web.

Tommaso Soru, Edgard Marx, and Axel-Cyrille Ngonga Ngomo, “ROCKER – A Refinement Operator for Key Discovery”. [PDF]

Call for Feedback on LIDER Roadmap

February 17, 2015 - 3:38 pm by AmrapaliZaveri

The LIDER project is gathering feedback on a roadmap for the use of Linguistic Linked Data for content analytics. We invite you to give feedback in the following ways:

Excerpt from the roadmap

Full document: available here
Summary slides: available here

Content is growing at an impressive, exponential rate. Exabytes of new data are created every single day. In fact, data has been recently referred to as the “oil” of the new economy, where the new economy is understood as “a new way of organizing and managing economic activity based on the new opportunities that the Internet provided for businesses”.

Content analytics, i.e. the ability to process and generate insights from existing content, plays and will continue to play a crucial role for enterprises and organizations that seek to generate value from data, e.g. in order to inform decision and policy making.

As corroborated by many analysts, substantial investments in technology, partnerships and research are required to reach an ecosystem consisting of many players and technological solutions that provide the necessary infrastructure, expertise and human resources required to make sure that organizations can effectively deploy content analytics solutions at large scale in order to generate relevant insights that support policy and decision making, or even to define completely new business models in a data-driven economy.

Assuming that such investments need to be and will be made, this roadmap explores the role that linked data and semantic technologies can and will play in the field of content analytics. It will generate a set of recommendations for organizations, funders and researchers on which technologies to invest in, as a basis for prioritizing their R&D investment and for optimizing their mid- and long-term strategies and roadmaps.

Conference Call on 19th of February 3 p.m. CET

Connection details: https://www.w3.org/community/ld4lt/wiki/Main_Page#LD4LT_calls
Summary slides: available here

Agenda

  1. Introduction to the LIDER Roadmap (Philipp Cimiano, 10 minutes)
  2. Discussion of Global Customer Engagement Use Cases (All, 10 minutes)
  3. Discussion of Public Sector and Civil Society Use Cases (All, 10 minutes)
  4. Discussion of Linked Data Life Cycle and Linguistic Linked Data Value Chain (All, 10 minutes)
  5. General Discussion on further use cases, items in the roadmap etc. (20 minutes)

In addition, the call will briefly discuss progress on the META-SHARE linked data metadata model.

The call is open to the public; no LD4LT group participation is required. Dial-in information is available. Please spread this information widely. No knowledge about linguistic linked data is required. We are especially interested in feedback from potential users of linguistic linked data.

About the LIDER Project

Website: http://lider-project.eu

The project’s mission is to provide the basis for the creation of a Linguistic Linked Data cloud that can support content analytics tasks on unstructured multilingual cross-media content. By achieving this goal, LIDER will have an impact on the ease and efficiency with which Linguistic Linked Data can be exploited in content analytics processes.

We aim to establish a new Linked Open Data (LOD) based ecosystem of free, interlinked, and semantically interoperable language resources (corpora, dictionaries, lexical and syntactic metadata, etc.) and media resources (image, video, etc. metadata) that will allow for free and open exploitation of such resources in multilingual, cross-media content analytics across the EU and beyond, with specific use cases in industries related to social media, financial services, localization, and other multimedia content providers and consumers.

Take part in a personal interview to include your voice in the roadmap

Contact: http://lider-project.eu/?q=content/contact-us

The EU project LIDER has been tasked by the European Commission to put together a roadmap for future R&D funding in multilingual industries such as content and knowledge localization, multilingual terminology and taxonomy management, cross-border business intelligence, etc. As you are a leading supplier of solutions in one or more of these industries, we need your input for this roadmap. We would like to conduct a short interview with you to establish your views on current and developing R&D efforts in multilingual and semantic technologies that will likely play an increasing role in these industries, such as Linked Data and related standards for web-based, multilingual data processing. The interview will cover the 5 questions below and will not take more than 30 minutes. Please let us know a suitable time and date.

AKSW Colloquium: Konrad Höffner and Michael Röder on Monday, February 16, 3.00 p.m.

February 16, 2015 - 1:45 pm by KonradHoeffner

CubeQA—Question Answering on Statistical Linked Data by Konrad Höffner

Abstract

Question answering systems provide intuitive access to data by translating natural language queries into SPARQL, which is the native query language of RDF knowledge bases. Statistical data, however, is structurally very different from other data and cannot be queried using existing approaches. Building upon a question corpus established in previous work, we created a benchmark for evaluating questions on statistical Linked Data in order to evaluate statistical question answering algorithms and to stimulate further research. Furthermore, we designed a question answering algorithm for statistical data, which covers a wide range of question types. To our knowledge, this is the first question answering approach for statistical RDF data and could open up a new research area.
See also the paper (preprint, under review) and the slides.
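
To give an idea of what a SPARQL query over statistical Linked Data can look like, here is a minimal sketch in Python. It is not the CubeQA algorithm: it only fills a fixed query template once a question has already been mapped to a dataset, a measure and dimension values. The `build_cube_query` helper and all `example.org` IRIs are hypothetical, and the sketch assumes the data is modelled with the RDF Data Cube vocabulary (`qb:Observation`, `qb:dataSet`).

```python
def build_cube_query(dataset, measure, restrictions):
    """Build a SPARQL SELECT query for one measure of matching observations.

    dataset, measure: IRIs as strings.
    restrictions: dict mapping dimension IRIs to the required value IRIs.
    """
    lines = [
        "PREFIX qb: <http://purl.org/linked-data/cube#>",
        "SELECT ?value WHERE {",
        "  ?obs a qb:Observation ;",
        f"       qb:dataSet <{dataset}> ;",
    ]
    # One pattern per dimension restriction, sorted for reproducible output.
    for dimension, value in sorted(restrictions.items()):
        lines.append(f"       <{dimension}> <{value}> ;")
    lines.append(f"       <{measure}> ?value .")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical mapping of a question such as
    # "How much aid did country X receive in 2010?"
    query = build_cube_query(
        dataset="http://example.org/dataset/aid",
        measure="http://example.org/measure/amount",
        restrictions={
            "http://example.org/dimension/recipient": "http://example.org/country/X",
            "http://example.org/dimension/year": "http://example.org/year/2010",
        },
    )
    print(query)
```

The hard part of question answering, mapping the words of a question to the right dataset, dimensions and values, is deliberately left out here and is assumed to have happened before the template is filled.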

News from WSDM 2015 by Michael Röder

Abstract

The WSDM conference is one of the major conferences on Web Search and Data Mining. Michael Röder attended this year's WSDM conference in Shanghai and will present a short overview of the conference topics. He will then take a closer look at FEL, an entity linking approach for search queries presented at the conference.

About the AKSW Colloquium

This event is part of a series of events about Semantic Web technology. Please see http://wiki.aksw.org/Colloquium for further information about previous and future events. As always, Bachelor and Master students are able to get points for attendance and there is complimentary coffee and cake after the session.