Searching Very Large Collections of Semantic Graphs
The WikiWoods project makes available logical-form meaning representations for (almost) the full English Wikipedia, i.e. about 900 million statements. Making use of this repository of formal knowledge requires efficient and scalable indexing and search technologies that can take advantage of the graph-shaped structure of the data. Furthermore, for ease of access by non-experts, it is desirable to design and implement a custom, semi-formal query language, possibly a suitable subset of a natural language, that is mapped system-internally to actual queries in a formal query language such as SPARQL.
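As a rough illustration of the mapping idea, the following Python sketch translates one fixed controlled-language pattern into a SPARQL SELECT string. The pattern "find X where X RELATION VALUE", the function name `to_sparql`, and the `ex:` namespace are all assumptions made for this example; they are not taken from WikiWoods, and a real front end would need a much richer grammar.

```python
import re

# Hypothetical namespace; the real vocabulary for the data is not specified here.
PREFIX = "PREFIX ex: <http://example.org/sem#>"

# Assumed controlled pattern: "find <var> where <var> <relation> <value>"
_PATTERN = re.compile(r"^find (\w+) where (\w+) (\w+) (\w+)$")


def to_sparql(query: str) -> str:
    """Translate one controlled-language query into a SPARQL SELECT string."""
    match = _PATTERN.match(query.strip().lower())
    if match is None:
        raise ValueError(f"unsupported query: {query!r}")
    target, subject, relation, value = match.groups()
    if target != subject:
        # Only the simplest case is handled: the selected variable is the subject.
        raise ValueError("the selected variable must be the subject of the condition")
    return (
        f"{PREFIX}\n"
        f"SELECT ?{target} WHERE {{\n"
        f"  ?{target} ex:{relation} ex:{value} .\n"
        f"}}"
    )


if __name__ == "__main__":
    print(to_sparql("find x where x chases dog"))
```

Keeping the surface pattern this rigid makes the translation trivial to verify; the open design question is how much of natural language such patterns can cover before the mapping becomes ambiguous.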
This project will investigate ways of indexing semantic networks to support rich, flexible querying, taking Semantic Web technologies such as RDF triple stores and the SPARQL query language as a starting point. Scalability will be a major focus of the project, so experience with handling large amounts of data would be beneficial. Details and further specification of the project can be discussed with Stephan Oepen and Milen Kouylekov.
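To make the triple-store starting point more concrete, here is a minimal in-memory sketch in Python of how subject-predicate-object triples can be indexed by each position and queried with wildcard patterns. The class name `TripleIndex`, its methods, and the sample triples are hypothetical; the sketch says nothing about how the WikiWoods data is actually encoded, and an index at the scale described above would need disk-based or distributed storage rather than Python sets.

```python
from collections import defaultdict
from typing import Optional

Triple = tuple[str, str, str]  # requires Python 3.9+


class TripleIndex:
    """Toy triple store with one hash index per triple position."""

    def __init__(self) -> None:
        self._triples: set[Triple] = set()
        self._by_s: dict[str, set[Triple]] = defaultdict(set)
        self._by_p: dict[str, set[Triple]] = defaultdict(set)
        self._by_o: dict[str, set[Triple]] = defaultdict(set)

    def add(self, s: str, p: str, o: str) -> None:
        """Insert a triple and register it in all three indexes."""
        triple = (s, p, o)
        if triple in self._triples:
            return
        self._triples.add(triple)
        self._by_s[s].add(triple)
        self._by_p[p].add(triple)
        self._by_o[o].add(triple)

    def match(
        self,
        s: Optional[str] = None,
        p: Optional[str] = None,
        o: Optional[str] = None,
    ) -> list[Triple]:
        """Return sorted triples matching the pattern; None is a wildcard."""
        candidates = []
        if s is not None:
            candidates.append(self._by_s.get(s, set()))
        if p is not None:
            candidates.append(self._by_p.get(p, set()))
        if o is not None:
            candidates.append(self._by_o.get(o, set()))
        if not candidates:
            return sorted(self._triples)
        # Scan the smallest candidate set and filter against the others.
        smallest = min(candidates, key=len)
        return sorted(t for t in smallest if all(t in c for c in candidates))


if __name__ == "__main__":
    index = TripleIndex()
    # Hypothetical encoding of a small semantic graph.
    index.add("e1", "pred", "chase")
    index.add("e1", "arg1", "x1")
    index.add("e1", "arg2", "x2")
    index.add("x1", "pred", "dog")
    print(index.match(p="pred"))
    print(index.match(s="e1", p="arg2"))
```

Starting each lookup from the smallest candidate set is a simple stand-in for the join-ordering decisions that a real SPARQL engine makes, which is where much of the scalability work is likely to lie.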