Easy Query Formulation for Complex Relationships
The challenge to be addressed is that for many relationships between items in databases, there is a simple and a complicated story. For instance, the simple story might be “Town A lies in country B.” The complicated story is “Town A lies in country B since 1990, according to the government of B, but only since 1995 according to the United Nations.” In a database containing the full information, queries will usually have to deal with the full details too.
The goal of this thesis is to extend and refine an existing visual user interface for query formulation in such a way that a) databases containing the full information can be queried, b) queries corresponding to the “simple story” can be constructed easily, without bothering users with the details, and c) queries involving the details can be posed with a few more clicks.
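To make the distinction concrete, the following is a minimal sketch in Python of how the town example could be stored with full details and still be queried in the simple way. It is an illustration only, not the design of the existing tool: the `LiesIn` record, the `FACTS` list, and the `towns_in` function are hypothetical names introduced here, and the data simply mirrors the example above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class LiesIn:
    """One qualified statement: a town lies in a country since a given
    year, according to a given source (hypothetical data model)."""
    town: str
    country: str
    since: int
    source: str


# Hypothetical data mirroring the example in the text.
FACTS = [
    LiesIn("Town A", "Country B", 1990, "Government of B"),
    LiesIn("Town A", "Country B", 1995, "United Nations"),
]


def towns_in(country: str,
             facts: List[LiesIn],
             source: Optional[str] = None,
             year: Optional[int] = None) -> List[str]:
    """Return the towns lying in `country`.

    Called without `source` and `year`, this is the "simple story":
    any statement counts. The optional arguments add the details.
    """
    result = set()
    for fact in facts:
        if fact.country != country:
            continue
        if source is not None and fact.source != source:
            continue
        if year is not None and fact.since > year:
            continue
        result.add(fact.town)
    return sorted(result)


# Simple query: the user does not have to mention sources or years.
simple = towns_in("Country B", FACTS)

# Detailed queries: the same question, restricted to one source and year.
un_1992 = towns_in("Country B", FACTS, source="United Nations", year=1992)
gov_1992 = towns_in("Country B", FACTS, source="Government of B", year=1992)
```

In this sketch the simple query is just the detailed query with all qualifiers left open, which is one possible reading of goals b) and c); other translations, for example preferring a default source, are equally conceivable.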
The master thesis will be carried out as part of a recent EU project called Optique (Scalable End-user Access to Big Data), and will give you the opportunity to interact with top researchers from all over Europe.
A tremendous amount of data is being generated every day, both on the Web and in public and private organisations. By all accounts, in this increasingly data-oriented world, any individual or organisation that possesses the necessary knowledge, skills, and tools to extract value from data at such scales has a considerable advantage in terms of competitiveness and development. In an enterprise setting in particular, the ability to access and use data in business processes such as sense-making and intelligence analysis is key to value creation.
Today, however, data access still stands as a major bottleneck for many organisations. This is mostly due to the sharp distinction between employees who have the technical skills and knowledge to extract data (i.e., database/IT experts, skilled users etc.) and those who have domain knowledge and know how to interpret and use data (i.e., domain experts, end-users etc.). The result is a workflow where domain experts either have to use pre-defined queries embedded in applications or communicate their information needs to database experts. In such a workflow, the turn-around time from a user's initial information need to receiving the answer can be in the range of weeks, incurring significant costs.
Approaches that eliminate the man-in-the-middle and allow end-users to engage with data directly and extract it on their own have been of interest to researchers for many years. For end-users, the accessibility of traditional structured query languages such as SQL and XQuery falls far short, since such textual languages require end-users to have a set of technical skills and to recall domain concepts as well as the terminology and syntax of the language being used. For this reason, visual query systems and languages have emerged to alleviate the end-user data access problem. A visual system or language follows the direct-manipulation idea, where the domain and query language are represented with a set of visual elements.
There are basically two types of activities that a data access system has to support: exploration (understanding the reality of interest) and query construction. The goal of the former is to establish an understanding of the domain by finding and identifying domain constructs, such as concepts and relationships, and their organisation. The goal of the latter is to formally express the information need. Exploration and construction have opposing (i.e., breadth vs. depth) yet complementary roles; therefore, they have to be addressed and intertwined adequately.
In this thesis, your work will be based on an existing tool for visual query formulation, developed in the LogID group within the framework of the Optique project. You will extend and refine this tool in close cooperation with its original developers, with a focus on concrete applications in industrial case studies. You will
- analyse different cases of “complex relationships” occurring in the project use cases
- develop a concept to characterise how “simple case” queries translate to queries over the actual data
- develop a concept for presenting complex relations in the user interface, in particular for representing the relationship between the simplified view and the detailed view
- implement these concepts in the query formulation tool