Anomaly Detection in Knowledge Graphs Using Graph Neural Networks
In this thesis, you will investigate how modern structure-aware machine learning techniques can be applied to practical challenges usually approached using classic symbolic AI techniques. Specifically, you will develop novel algorithms based on graph neural networks (GNNs) for anomaly detection in knowledge graphs, and test them against existing approaches in synthetic and real-life settings.
What is this thesis going to cover?
Currently, there are two main AI paradigms. On the one hand, there is classical symbolic AI, which takes a top-down approach: problems are formulated in a high-level, human-readable way and solved by methods such as logical reasoning and combinatorial optimisation, taking into account expert knowledge in a structured, machine-readable format. On the other hand, there is statistical, or sub-symbolic, AI, which has recently become extremely popular in practice. This paradigm is bottom-up and data-driven: problems are formulated in terms of examples, and an AI tool numerically generalises the examples (that is, learns a model) and makes decisions using this model. However, it is now widely understood that each of these two paradigms has its own fundamental limitations; for example, logic-based methods often fall short in applications dealing with large volumes of unstructured and noisy data, while learning-based methods have difficulties with the interpretability of learned models, as well as with handling structured input and expert knowledge. Thus, one of the main current AI research directions is to combine and integrate these paradigms efficiently, so that the resulting hybrid neuro-symbolic systems can exploit the strengths of both while mitigating their weaknesses.
One of the recent bridges between the symbolic and sub-symbolic paradigms is graph neural networks (GNNs, also known as graph convolutional networks or simply graph networks; see a survey). On the one hand, they are neural networks: they learn and manipulate vectors and matrices of numbers to predict an outcome. On the other hand, they work directly on graph-structured input, thus resembling classic AI methods, including those of Knowledge Representation and Semantic Technologies. In particular, GNNs are often applied to practical tasks on Knowledge Graphs, such as Knowledge Graph completion and ontology learning.
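To make the idea of a neural network that works directly on graph-structured input more concrete, here is a minimal sketch of one round of message passing, the basic building block of many GNN architectures. It is plain Python rather than PyTorch, and everything in it is an illustrative assumption: the function name `message_passing_layer`, the mean aggregation, the ReLU activation, and the fixed weight matrices `w_self` and `w_neigh`, which a real GNN would learn from data.

```python
def message_passing_layer(features, edges, w_self, w_neigh):
    """One round of mean-aggregation message passing (illustrative sketch).

    features: dict mapping node -> list of floats (the node's vector)
    edges:    list of directed (source, target) pairs
    w_self, w_neigh: weight matrices given as lists of rows;
                     fixed here, learned in a real GNN
    """
    # Collect the feature vectors arriving at each node along its incoming edges.
    incoming = {node: [] for node in features}
    for src, dst in edges:
        incoming[dst].append(features[src])

    def matvec(matrix, vector):
        return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

    updated = {}
    for node, h in features.items():
        messages = incoming[node]
        if messages:
            # Average the neighbours' vectors component by component.
            mean = [sum(col) / len(messages) for col in zip(*messages)]
        else:
            mean = [0.0] * len(h)
        combined = [s + n for s, n in zip(matvec(w_self, h), matvec(w_neigh, mean))]
        updated[node] = [max(0.0, x) for x in combined]  # ReLU activation
    return updated


# Toy graph: nodes "a" and "b" both point to "c".
features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
edges = [("a", "c"), ("b", "c")]
identity = [[1.0, 0.0], [0.0, 1.0]]  # assumption: identity weights for readability

new_features = message_passing_layer(features, edges, identity, identity)
print(new_features["c"])  # [1.5, 1.5]: own vector plus the mean of "a" and "b"
```

Stacking several such rounds lets information travel along longer paths in the graph, so the vector of a node ends up depending on both the numeric features and the structure of its neighbourhood.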
In this thesis, you will investigate the applicability of GNNs to another important Semantic Web problem, namely anomaly detection in Knowledge Graphs (see a survey). In the context of this thesis, we define anomaly detection as follows: given a knowledge graph in RDF format, find the substructures that are rare and that differ significantly from the majority of the reference objects in the graph.
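As a rough illustration of this definition, the sketch below flags triples whose type-level pattern is rare in a toy graph. It is a naive frequency baseline, not a GNN-based method and not one of the existing algorithms mentioned in this description. The function name `rare_pattern_triples`, the `max_share` threshold, the `ex:` identifiers, and the simplification that every entity has at most one `rdf:type` are all assumptions made for the example.

```python
from collections import Counter

RDF_TYPE = "rdf:type"


def rare_pattern_triples(triples, max_share=0.1):
    """Return the non-type triples whose (subject type, predicate, object type)
    pattern accounts for at most `max_share` of all non-type triples."""
    # Assumption: each entity has at most one rdf:type statement.
    types = {s: o for s, p, o in triples if p == RDF_TYPE}
    facts = [t for t in triples if t[1] != RDF_TYPE]

    def pattern(triple):
        s, p, o = triple
        return (types.get(s, "?"), p, types.get(o, "?"))

    counts = Counter(pattern(t) for t in facts)
    total = len(facts)
    return [t for t in facts if counts[pattern(t)] / total <= max_share]


# Toy graph: nine people work for a company, one "works for" a city.
triples = [(f"ex:p{i}", RDF_TYPE, "ex:Person") for i in range(10)]
triples += [("ex:acme", RDF_TYPE, "ex:Company"), ("ex:berlin", RDF_TYPE, "ex:City")]
triples += [(f"ex:p{i}", "ex:worksFor", "ex:acme") for i in range(9)]
triples += [("ex:p9", "ex:worksFor", "ex:berlin")]

anomalies = rare_pattern_triples(triples)
print(anomalies)  # [('ex:p9', 'ex:worksFor', 'ex:berlin')]
```

A baseline of this kind only captures rarity of single-edge patterns; substructures that span several triples need richer representations, which is where learned graph embeddings may help.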
One of the main obstacles to anomaly detection is that it is computationally difficult, and hence standard symbolic AI algorithms fail to process Knowledge Graphs of real-life size and complexity. Thus, the main hypothesis of this project is that GNN-based algorithms can improve the speed of anomaly detection without substantially sacrificing the quality of the results. In your thesis, you will test this hypothesis and either confirm or refute it; in particular, you will
- read literature on relevant topics, such as GNNs, Knowledge Graphs, and anomaly detection;
- study existing algorithms for anomaly detection;
- learn and implement various GNN architectures using Python and PyTorch;
- investigate their applicability to anomaly detection in several knowledge graphs, both synthetic and real-life;
- compare your implementations with each other and with existing methods, in terms of both speed and quality;
- possibly: investigate how implicit knowledge in the form of ontologies affects the performance and the comparison.
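The comparison "in speed and quality" from the list above could, for instance, be organised around a small harness like the following sketch. The function name `evaluate`, the choice of precision, recall, and F1 as quality measures, and the stand-in detector are assumptions for illustration; the actual evaluation protocol is part of the thesis work.

```python
import time


def evaluate(detector, graph, true_anomalies):
    """Run `detector` on `graph` and report quality and wall-clock time.

    detector:       any callable that returns an iterable of flagged items
    true_anomalies: the items known to be anomalous (e.g. injected synthetically)
    """
    start = time.perf_counter()
    predicted = set(detector(graph))
    seconds = time.perf_counter() - start

    truth = set(true_anomalies)
    hits = len(predicted & truth)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(truth) if truth else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "seconds": seconds}


# Hypothetical stand-in detector that always flags the same two items.
def dummy_detector(graph):
    return ["t1", "t2"]


report = evaluate(dummy_detector, graph=None, true_anomalies=["t2", "t3"])
print(report["precision"], report["recall"], report["f1"])  # 0.5 0.5 0.5
```

Running every method through the same function keeps the timing and scoring identical across GNN-based and existing approaches.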
Why should you apply for this thesis?
This thesis offers the opportunity to gain experience in modern AI methods and technologies from both the symbolic and sub-symbolic paradigms. In particular, you will apply a recent, powerful neural network architecture to a practically important problem, while learning many up-to-date concepts, tools, and infrastructure, such as Knowledge Graphs and PyTorch.
Who should apply?
We are looking for a highly motivated master's student who is genuinely interested in gaining a rounded understanding of, and experience in, modern AI techniques. A general understanding of Semantic Technologies and Neural Networks is desirable.