Last week Saeedeh Shekarpour was invited to present her work at the IBM research center (Watson project, DeepQA) in New York. On Monday, December 16 at 1:30 pm in Room P-702 (Paulinum), she will present SINA, a question answering system that transforms user-supplied natural-language queries into conjunctive SPARQL queries over a set of interlinked data sources.
As always, Bachelor and Master students can earn points for attendance, and there is complimentary coffee and cake after the session.
For further reading, please refer to the slides and the publication Question Answering on Interlinked Data (BibTeX).
The SINA Question Answering System
The architectural choices underlying Linked Data have led to a compendium of data sources which contain both duplicated and fragmented information on a large number of domains. One way to enable non-expert users to access this data compendium is to provide keyword search frameworks that can capitalize on the inherent characteristics of Linked Data. The contributions of this work are as follows:
- A novel approach for determining the most suitable resources for a user-supplied query from different datasets (disambiguation). It employs a hidden Markov model, whose parameters were bootstrapped with different distribution functions.
- A novel method for constructing federated formal queries using the disambiguated resources and leveraging the linking structure of the underlying datasets. This approach essentially relies on a combination of domain and range inference as well as a link traversal method for constructing a connected graph, which is ultimately rendered as a corresponding SPARQL query.
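
To make the disambiguation step more concrete, here is a minimal sketch of how a hidden Markov model can pick one resource per query keyword. It uses standard Viterbi decoding, which is an assumption on our part: the abstract only states that a hidden Markov model is employed, not how it is decoded. The keywords, the candidate resources and all probabilities are hypothetical toy values, not SINA's bootstrapped parameters.

```python
from typing import Dict, List, Optional, Tuple


def viterbi(
    observations: List[str],
    states: List[str],
    start_p: Dict[str, float],
    trans_p: Dict[str, Dict[str, float]],
    emit_p: Dict[str, Dict[str, float]],
) -> Tuple[float, List[str]]:
    """Return (probability, path) of the most likely state sequence.

    Assumes a non-empty observation list.
    """
    # Each column maps a state to (best path probability, previous state).
    first = observations[0]
    columns: List[Dict[str, Tuple[float, Optional[str]]]] = [
        {s: (start_p[s] * emit_p[s].get(first, 0.0), None) for s in states}
    ]
    for obs in observations[1:]:
        prev = columns[-1]
        column = {}
        for s in states:
            # Best predecessor of s, judged by path probability times transition.
            prob, back = max(
                ((prev[r][0] * trans_p[r].get(s, 0.0), r) for r in states),
                key=lambda pair: pair[0],
            )
            column[s] = (prob * emit_p[s].get(obs, 0.0), back)
        columns.append(column)

    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: columns[-1][s][0])
    prob = columns[-1][last][0]
    path = [last]
    for column in reversed(columns[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return prob, path


# Hypothetical toy model: states are candidate resources, observations are keywords.
states = ["dbpedia:Paris", "dbpedia:Paris_Hilton", "dbo:populationTotal"]
start_p = {"dbpedia:Paris": 0.4, "dbpedia:Paris_Hilton": 0.4, "dbo:populationTotal": 0.2}
# How well each resource matches a keyword (made-up values).
emit_p = {
    "dbpedia:Paris": {"paris": 1.0},
    "dbpedia:Paris_Hilton": {"paris": 1.0},
    "dbo:populationTotal": {"population": 1.0},
}
# How plausibly one resource follows another (made-up values).
trans_p = {
    "dbpedia:Paris": {"dbo:populationTotal": 0.8, "dbpedia:Paris_Hilton": 0.1, "dbpedia:Paris": 0.1},
    "dbpedia:Paris_Hilton": {"dbo:populationTotal": 0.1, "dbpedia:Paris": 0.1, "dbpedia:Paris_Hilton": 0.8},
    "dbo:populationTotal": {"dbpedia:Paris": 0.4, "dbpedia:Paris_Hilton": 0.4, "dbo:populationTotal": 0.2},
}

prob, path = viterbi(["paris", "population"], states, start_p, trans_p, emit_p)
print(path)
```

In this toy setup both Paris resources match the keyword "paris" equally well, so the choice is made by the transition probabilities: the city is far more likely to be followed by a population property than the person is. The disambiguated resources would then feed the query construction step described in the second bullet.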