- NLP Data Cleansing Based on Linguistic Ontology Constraints (Dimitris Kontokostas, Martin Brümmer, Sebastian Hellmann, Jens Lehmann and Lazaros Ioannidis)
- Unsupervised Link Discovery Through Knowledge Base Repair (Axel-Cyrille Ngonga Ngomo, Mohamed Sherif and Klaus Lyko)
- conTEXT – Lightweight Text Analytics using Linked Data (Ali Khalili, Sören Auer and Axel-Cyrille Ngonga Ngomo)
- HiBISCuS: Hypergraph-Based Source Selection for SPARQL Endpoint Federation (Muhammad Saleem and Axel-Cyrille Ngonga Ngomo)
- Hybrid Acquisition of Temporal Scopes for RDF Data (Anisa Rula, Matteo Palmonari, Axel-Cyrille Ngonga Ngomo, Daniel Gerber, Jens Lehmann and Lorenz Bühmann)
On Jan 21, 2014 at 15:00 CET we hosted a webinar on crowdsourced, multilingual OpenCourseWare authoring with http://SlideWiki.org:
SlideWiki.org is a platform for OpenCourseWare authoring and publishing. Just as Wikipedia allows the collaborative authoring of encyclopedic texts, GitHub of source code and OpenStreetMap of maps, SlideWiki enables communities to create comprehensive open educational resources. SlideWiki is open-source software, and all content in SlideWiki is Open Knowledge. In this hangout we introduce SlideWiki's rich feature set and explain how SlideWiki can be used for educational projects and teaching.
We are happy to announce the preview release of conTEXT — a platform for lightweight text analytics using Linked Data.
conTEXT enables social Web users to semantically analyze text corpora (such as blogs, RSS/Atom feeds, Facebook, G+, Twitter or SlideWiki.org decks) and provides novel ways for browsing and visualizing the results.
The process of text analytics in conTEXT starts by collecting information from the web. conTEXT utilizes standard information access methods and protocols such as RSS/Atom feeds, SPARQL endpoints and REST APIs, as well as customized crawlers for WordPress and Blogger, to build a corpus of information relevant to a particular user. The assembled text corpus is then processed by Natural Language Processing (NLP) services (currently FOX and DBpedia Spotlight), which link unstructured information sources to the Linked Open Data cloud through DBpedia.

The processed corpus is further enriched by dereferencing the DBpedia URIs as well as by matching against pre-defined natural-language patterns for DBpedia predicates (BOA patterns). The processed data can also be joined with other existing corpora in a text analytics mashup. Creating such analytics mashups requires dealing with the heterogeneity of different corpora as well as the heterogeneity of the different NLP services used for annotation; conTEXT employs NIF (NLP Interchange Format) to handle this heterogeneity.

The processed, enriched and possibly mixed results are presented to users through different views for exploring and visualizing the data. Additionally, conTEXT provides an annotation refinement user interface based on the RDFa Content Editor (RDFaCE), enabling users to revise the annotation results. User-refined annotations are sent back to the NLP services as feedback, so the system can learn from them.
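To make the NIF-based exchange step more concrete, here is a minimal, illustrative Python sketch (not conTEXT's actual code) of how entity annotations can be represented with NIF-style character-offset URIs and `itsrdf:taIdentRef` links to DBpedia. The document URI, the `annotate` helper and the dictionary layout are assumptions made for this example; real NIF output is RDF.

```python
# Illustrative sketch: NIF-style character-offset annotations.
# nif_uri() follows the NIF convention of identifying a substring of a
# document by its begin/end character offsets.

def nif_uri(doc_uri, start, end):
    """Build a NIF-style string URI from begin/end character offsets."""
    return f"{doc_uri}#char={start},{end}"

def annotate(doc_uri, text, mentions):
    """Turn (surface form, DBpedia resource) pairs into NIF-style records."""
    annotations = []
    for surface, resource in mentions:
        start = text.find(surface)
        if start == -1:
            continue  # mention does not occur in this text
        annotations.append({
            "uri": nif_uri(doc_uri, start, start + len(surface)),
            "anchorOf": surface,        # the annotated substring
            "taIdentRef": resource,     # link to the DBpedia resource
        })
    return annotations

text = "Leipzig is a city in Saxony."
result = annotate("http://example.org/post1", text,
                  [("Leipzig", "http://dbpedia.org/resource/Leipzig")])
```

Because every service's output is reduced to the same offset-based representation, results from different NLP tools over the same corpus can be merged without format-specific glue code.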
- Online demo: http://context.aksw.org
- Screencast: http://youtu.be/EiGdkDRu_Ew
- Some examples of analyzed corpora: CNN, BBC, AKSW, LOD2 blogs or tweets from Bill Gates, Barack Obama, Ali Khalili or Sören Auer
- Publication (under review): Ali Khalili, Sören Auer, Axel-Cyrille Ngonga Ngomo: conTEXT – Lightweight Text Analytics using Linked Data
- There will be a webinar introducing the main features of conTEXT on Thursday, January 30, 2014, 4:00 PM – 5:00 PM CET. For further information, visit https://www4.gotomeeting.com/register/334511455
From November 25 to 27 the members of the ERM project (electronic resource management for libraries) attended SWIB13 (Semantic Web in Libraries) in Hamburg. We, that is Andreas, Natanael, Norman and Thomas, were joined there by our colleagues Lydia, Björn and Leander from the University Library. We started Monday morning by visiting the DINI working group meeting, where we got in touch with Carsten Klee, the maintainer of the holding ontology. In the evening, some of us attended an introduction to Linked Data for beginners and others a PhD workshop.
Tuesday was the first day of the main conference. We listened to an inspiring keynote by Dorothea Salo, which focused mainly on how Linked Data and its technology should be about integrating new people rather than overly emphasizing technical aspects. After all, the Semantic Web should be for people, not computers; as she put it, "data without people is just noise". This set the overall tenor for the conference and sharpened a point that shone through in many other talks: while we are skilled at producing Linked Data, we sometimes drift towards doing Linked Data just for the sake of doing it, without actual use cases or applications that understand Linked Data and generate a benefit for librarians. This idea was reinforced in Martin Malmsten's talk, which opened the second day: we should not deliver RDF as an optional view of some data hub, but rather treat Linked Data as a first-class citizen that we build our applications around.
Another question that recurred during the conference was: how can and should we reuse ontologies instead of inventing new ones? This is a problem we also face in our work on the ERM project. Kai Eckert talked about the DM2E project (which focuses on digitizing manuscripts for Europeana) and about its underlying application profile and property extensions (DM2E). We also had a brief discussion with Adrian Pohl, who recently blogged about his idea to express a machine-readable application profile with OAI-ORE (blog post). In the evening we gained some personal insights into library topics during the conference dinner in the Blockbräu pub at the Landungsbrücken.
On the last day we presented our project in a lightning talk. Leander gave a general outline of the project and its goals, while Natanael gave a brief introduction to OntoWiki and how we use it in our project (slideshare).
In private conversations we explained further details about OntoWiki and may have won more people for the OntoWiki community. We were invited by Stefanie Rühle, the head of the DINI-KIM working group, to give a presentation in April next year. We also got Professor Magnus Pfeffer interested in OntoWiki; he is thinking about using it for his teaching. We explained some details about setting up OntoWiki and Virtuoso, and he might get back to us for more information.
The conference gave us, as computer scientists, a good chance to gain insights into the topics of interest in library information management. We enjoyed being in Hamburg and would like to come back for SWIB2014 in Cologne.
Gearman provides a generic application framework to farm out work to other machines or processes that are better suited to do the work.
More specifically, we had the site extension (e.g. asynchronous content publishing), pingback (web pings can be time-consuming), publish/subscribe (likewise), data testing (e.g. our own Databugger) and other use cases in mind.
The asynchronous jobs feature has now been merged into the develop branch and will be published with the next regular release (of both Erfurt and OntoWiki). To use it already, Christian from eccenca has created a nice documentation page for extension developers.
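Gearman itself is language-agnostic (OntoWiki's integration lives in PHP), so as a rough illustration of the farm-out pattern it provides, here is a minimal stdlib-only Python sketch: a "client" enqueues a slow job and returns immediately, while a background worker processes it, analogous to a Gearman worker picking up jobs such as pingbacks from the job server. The queue, the `send_pingback` function and all names here are our own illustration, not part of Gearman or OntoWiki.

```python
import queue
import threading

# Minimal stand-in for the Gearman pattern: jobs are enqueued by the
# request path and executed asynchronously by a background worker.
jobs = queue.Queue()
results = []

def send_pingback(target_url):
    # Placeholder for a slow task, e.g. an HTTP ping to another blog.
    results.append(f"pinged {target_url}")

def worker():
    while True:
        func, arg = jobs.get()
        if func is None:      # sentinel value: shut the worker down
            break
        func(arg)             # do the slow work off the request path
        jobs.task_done()

t = threading.Thread(target=worker)
t.start()

# The "web request" only enqueues the job and returns immediately.
jobs.put((send_pingback, "http://example.org/post"))

jobs.put((None, None))        # tell the worker to stop
t.join()
```

Gearman adds what this toy version lacks: a standalone job server, persistent and distributed queues, and workers in any language, which is what makes it attractive for offloading pingbacks, publish/subscribe notifications and data testing from the web request cycle.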