DBpedia, a community project affiliated with the Institute for Applied Informatics (InfAI) e.V., extracts structured information from Wikipedia and Wikidata. DBpedia has now launched the DBpedia Open Text Extraction Challenge (TextExt). The aim is to increase the amount of structured DBpedia/Wikipedia data and to provide a platform for benchmarking various extraction tools. DBpedia wants to polish the knowledge of Wikipedia and then spread it on the web, free and open for any IT users and businesses.
Procedure
Unlike other challenges, which are often just one-time calls, TextExt is a continuous challenge that focuses on lasting progress and on pushing limits in a systematic way. At regular intervals, DBpedia provides the extracted and cleaned full text of all Wikipedia articles in 9 different languages, both for download and as a Docker image, in the machine-readable NIF-RDF format (example: Barack Obama in English). Challenge participants are asked to wrap their NLP and extraction engines in Docker images and submit them to the DBpedia team. The team will run participants’ tools at regular intervals in order to extract:
- Facts, relations, events, terminology, ontologies as RDF triples (Triple track)
- Useful NLP annotations such as pos-tags, dependencies, co-reference (Annotation track)
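To make the expected output more concrete, here is a minimal sketch of how a tool might serialize one annotated text span as RDF in N-Triples, using offset-based URIs and properties from the NIF Core ontology. The document URI, the function names and the choice of properties are assumptions made for illustration. They are not the official TextExt submission format, which is described on the challenge page.

```python
from typing import List

# NIF Core ontology namespace plus standard RDF/XSD terms.
NIF = "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#"
XSD = "http://www.w3.org/2001/XMLSchema#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def escape_literal(value: str) -> str:
    """Escape a string for use inside a double-quoted N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def span_to_ntriples(doc_uri: str, text: str, begin: int, end: int) -> List[str]:
    """Describe text[begin:end] as a NIF string (hypothetical helper)."""
    if not 0 <= begin < end <= len(text):
        raise ValueError("span offsets must lie inside the text")
    # Offset-based URIs: the context covers the whole text, the span a part of it.
    context = f"<{doc_uri}#char=0,{len(text)}>"
    span = f"<{doc_uri}#char={begin},{end}>"
    return [
        f"{span} <{RDF_TYPE}> <{NIF}String> .",
        f"{span} <{NIF}referenceContext> {context} .",
        f'{span} <{NIF}beginIndex> "{begin}"^^<{XSD}nonNegativeInteger> .',
        f'{span} <{NIF}endIndex> "{end}"^^<{XSD}nonNegativeInteger> .',
        f'{span} <{NIF}anchorOf> "{escape_literal(text[begin:end])}" .',
    ]


# Usage: annotate the first 12 characters of a sample sentence.
# "http://example.org/doc/Barack_Obama" is a placeholder URI, not a real dataset URI.
sample = "Barack Obama was born in Honolulu."
for line in span_to_ntriples("http://example.org/doc/Barack_Obama", sample, 0, 12):
    print(line)
```

A tool in either track could attach further triples to the same span URI, for example a part-of-speech tag or a link to an extracted entity, so that its results stay aligned with the character offsets of the source text.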
DBpedia accepts submissions 2 months prior to selected conferences (currently http://ldk2017.org/ and http://2017.semantics.cc/ ). Participants who fulfil the technical requirements and provide a sufficient description will be able to present at the conference and be included in the yearly proceedings. At each conference, the challenge committee will select a winner among the challenge participants, who will receive €1,000.
Results
Starting in December 2017, DBpedia will publish a summary article and proceedings of participants’ submissions at http://ceur-ws.org/ every year.
For further news and upcoming events, please have a look at http://wiki.dbpedia.org/textext or contact DBpedia via email at dbpedia-textext-challenge@infai.org.
The project was created with the support of the H2020 EU projects HOBBIT (GA-688227) and ALIGNED (GA-644055) as well as the BMWi project Smart Data Web (GA-01MD15010B).
Challenge Committee
- Sebastian Hellmann, AKSW, DBpedia Association, KILT Competence Center, InfAI e.V., Leipzig
- Sören Auer, Fraunhofer IAIS, University of Bonn
- Ricardo Usbeck, AKSW, Simba Competence Center, Leipzig University
- Dimitris Kontokostas, AKSW, DBpedia Association, KILT Competence Center, InfAI e.V., Leipzig
- Sandro Coelho, AKSW, DBpedia Association, KILT Competence Center, InfAI e.V., Leipzig