Natural Language Query Formulation
In the LogID group at ifi, a “visual query formulation” tool has been developed that allows database queries to be constructed through a combination of ontology navigation, faceted search, and graph manipulation. See the demo video below.
In this thesis, you will explore ways of enhancing query construction by means of natural language processing technology. Users will be able to specify initial versions of queries in natural language. These queries can then be refined using, e.g., a dialog system or the existing query formulation tool.
The master's thesis will be carried out as part of a recent EU project called Optique (Scalable End-user Access to Big Data) and will give you the opportunity to interact with top researchers from all over Europe. It will be co-supervised by ifi's LogID group and the Language Technology Group (LTG).
The underlying idea of Optique is to elicit end users' information needs in the form of queries. The queries are based on the end users' domain vocabulary, captured in an ontology. In addition, query formulation components can make use of
- actual data in the database (users don't want to pose queries that have no answers)
- previously formulated queries (users will ask similar queries again)
- hand tuning, specific to the application domain.
This holds both for the menu/navigation-based query formulation component and for a natural-language-based one.
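To make two of these signals concrete, the following is a minimal sketch of how candidate queries could be filtered by the data and ranked against previously formulated queries. It is an illustration under stated assumptions, not a description of the Optique components: the `Candidate` type, the `rank` function, the Jaccard similarity measure, and the example ontology terms are all invented here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A candidate formal query, reduced to the ontology terms it mentions."""
    terms: frozenset
    answer_count: int  # assumed to come from evaluating the query on the data


def jaccard(a, b):
    """Overlap between two term sets, in the range [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank(candidates, query_log):
    """Drop candidates without answers, then sort by similarity to logged queries."""
    def score(candidate):
        return max((jaccard(candidate.terms, past) for past in query_log), default=0.0)

    viable = [c for c in candidates if c.answer_count > 0]
    return sorted(viable, key=score, reverse=True)


# Hypothetical query log and candidates over an invented ontology.
log = [frozenset({"Employee", "worksIn", "Department"})]
candidates = [
    Candidate(frozenset({"Employee", "manages", "Department"}), 3),
    Candidate(frozenset({"Employee", "worksIn", "Department"}), 12),
    Candidate(frozenset({"Employee", "hasBadge"}), 0),
]
ranked = rank(candidates, log)
```

In this sketch, the candidate with no answers is discarded and the one matching a logged query is ranked first. Hand tuning for a specific domain could enter as additional weights in `score`.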
Work to be performed for this thesis includes
- research on existing work on
  - natural language (database) query formulation
  - natural language processing based on ontologies and structured data
- design of a system that reads natural language text describing an information need and extracts possible (formal) queries as interpretations
- implementation of a prototypical system
Depending on the progress and interests of the candidate, the work might include
- the design of a dialog system to refine queries through a natural language dialog
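
As a rough illustration of the core task in the work list, reading a natural language description and extracting possible formal queries as interpretations, a toy version might look like the sketch below. Everything in it is an assumption made for illustration: the lexicon, the tiny ontology, and the SPARQL-style output strings (with prefix declarations omitted) are invented, and a real system would need proper linguistic analysis rather than keyword lookup.

```python
import re

# Hypothetical lexicon mapping surface words to ontology classes.
LEXICON = {
    "employee": "Employee",
    "employees": "Employee",
    "department": "Department",
    "departments": "Department",
}

# Hypothetical ontology properties as (name, domain class, range class).
PROPERTIES = [
    ("worksIn", "Employee", "Department"),
    ("manages", "Employee", "Department"),
]


def find_classes(text):
    """Return the ontology classes mentioned in the text, in order of first mention."""
    found = []
    for token in re.findall(r"[a-z]+", text.lower()):
        cls = LEXICON.get(token)
        if cls is not None and cls not in found:
            found.append(cls)
    return found


def interpretations(text):
    """Build one SPARQL-style query string per property linking two mentioned classes."""
    classes = find_classes(text)
    queries = []
    for prop, domain, rng in PROPERTIES:
        if domain in classes and rng in classes:
            queries.append(
                f"SELECT ?x ?y WHERE {{ ?x a :{domain} . ?y a :{rng} . ?x :{prop} ?y }}"
            )
    return queries


# The request is ambiguous, so more than one interpretation is returned.
queries = interpretations("Which employees belong to which department?")
```

Because the request does not say how employees relate to departments, the sketch returns one candidate per matching property. Choosing among such candidates is where a refinement dialog or the existing query formulation tool could come in.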