AKSW Colloquium, 27-04-2015, Ontotext’s RDF database-as-a-service (DBaaS) via the Self-Service Semantic Suite (S4) platform & Knowledge-Based Trust

This colloquium features two talks. First, Marin Dimitrov (Ontotext) presents the Self-Service Semantic Suite (S4) platform, followed by Jörg Unbehauen’s report on Google’s effort to use factual correctness as a ranking factor.

RDF database-as-a-service (DBaaS) via Self-Service Semantic Suite (S4) platform

In this talk, Marin Dimitrov (Ontotext) will introduce the RDF database-as-a-service (DBaaS) options for managing RDF data in the Cloud via the Self-Service Semantic Suite (S4) platform. With S4, developers and researchers can instantly get access to a fully managed RDF DBaaS, without the need for hardware provisioning, maintenance and operations. Additionally, the S4 platform provides on-demand access to text analytics services for news, social media and life sciences, as well as access to knowledge graphs (DBpedia, Freebase and GeoNames).

The goal of the S4 platform is to make it easy for developers and researchers to develop smart/semantic applications, without the need to spend time and effort on infrastructure provisioning and maintenance. Marin will also provide examples of EC-funded research projects (DaPaaS, ProDataMarket and KConnect) that plan to utilise the S4 platform for semantic data management.

More information on S4 will be available in [1], [2] and [3].

Report on: “Knowledge-Based Trust: Estimating the Trustworthiness of Web Sources”

by Xin Luna Dong, Evgeniy Gabrilovich, Kevin Murphy, Van Dang, Wilko Horn, Camillo Lugaresi, Shaohua Sun, Wei Zhang

Link to the paper

Presentation by Jörg Unbehauen


“The quality of web sources has been traditionally evaluated using exogenous signals such as the hyperlink structure of the graph. We propose a new approach that relies on endogenous signals, namely, the correctness of factual information provided by the source. A source that has few false facts is considered to be trustworthy. The facts are automatically extracted from each source by information extraction methods commonly used to construct knowledge bases. We propose a way to distinguish errors made in the extraction process from factual errors in the web source per se, by using joint inference in a novel multi-layer probabilistic model. We call the trustworthiness score we computed Knowledge-Based Trust (KBT). On synthetic data, we show that our method can reliably compute the true trustworthiness levels of the sources. We then apply it to a database of 2.8B facts extracted from the web, and thereby estimate the trustworthiness of 119M webpages. Manual evaluation of a subset of the results confirms the effectiveness of the method.”
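
To give a feel for the core idea in the abstract (a source with few false facts is considered trustworthy), here is a minimal Python sketch. It is not the multi-layer probabilistic model from the paper: it ignores extraction errors entirely and simply alternates between a trust-weighted vote over candidate values and a per-source average of those vote shares. The function name estimate_trust, the prior of 0.8, and the example sources and claims are all assumptions made up for illustration.

```python
from collections import defaultdict


def estimate_trust(claims, iterations=20, prior=0.8):
    """Naive iterative trust estimate (illustrative only, not the paper's model).

    claims: list of (source, item, value) triples.
    Returns (trust per source, belief per (item, value) pair).
    """
    sources = {s for s, _, _ in claims}
    trust = {s: prior for s in sources}  # assumed starting trust for every source
    belief = {}
    for _ in range(iterations):
        # Step 1: each candidate value gets the share of total trust
        # (among sources covering that item) that asserts it.
        votes = defaultdict(float)
        totals = defaultdict(float)
        for s, item, value in claims:
            votes[(item, value)] += trust[s]
            totals[item] += trust[s]
        belief = {key: v / totals[key[0]] for key, v in votes.items()}

        # Step 2: a source's trust is the mean belief in the values it asserts.
        sums = defaultdict(float)
        counts = defaultdict(int)
        for s, item, value in claims:
            sums[s] += belief[(item, value)]
            counts[s] += 1
        trust = {s: sums[s] / counts[s] for s in sources}
    return trust, belief


# Hypothetical toy data: three sources, two data items.
claims = [
    ("a.example", "capital_of_france", "Paris"),
    ("b.example", "capital_of_france", "Paris"),
    ("c.example", "capital_of_france", "Lyon"),
    ("a.example", "capital_of_italy", "Rome"),
    ("b.example", "capital_of_italy", "Rome"),
    ("c.example", "capital_of_italy", "Rome"),
]

trust, belief = estimate_trust(claims)
for source in sorted(trust):
    print(source, round(trust[source], 3))
```

In this toy data, the source that disagrees with the other two on one item ends up with a lower score than they do. The paper's contribution goes beyond this kind of voting by jointly modelling whether an apparent error comes from the web source or from the extractor, which this sketch does not attempt.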

