immuneML: a domain-tailored machine learning ecosystem to decipher adaptive immunity
“We have developed a large software framework that provides a broad variety of machine learning modules for discovering patterns in immune data. It implements each step of the machine learning process in an extensible, open-source software ecosystem that is based on fully specified and shareable workflows,” says Geir Kjetil Sandve, Professor at the Centre for Bioinformatics.
The study was recently made available on the bioRxiv preprint server and can be found here.
The study has been a large team effort with contributions from national and international collaborators, including the group of Victor Greiff, Associate Professor at the Department of Immunology.
The immune system remembers successful battles of the past so that it can mount a more efficient immune response upon meeting the same pathogen a second time. Just as our brain remembers many things from childhood up to the present, our immune system keeps a record of every infection it has encountered over a lifetime in the most universal language of all: DNA.
Over the past 10 years, it has become possible to read the DNA of immune cells, a technique called immune receptor sequencing. But reading the DNA of immune cells does not necessarily mean we understand how that DNA encodes immune memory.
Recently, machine learning has shown remarkable success in identifying hidden patterns in very large sets of biological data. The hope is that it may also provide us with a Rosetta Stone for immunology, helping us translate the DNA sequences of immune responses into a map of pathogen encounters, autoimmune processes and cancer. Even partial success in this endeavor would provide the grounds for a revolution in disease diagnostics and therapeutics. However, to date, widespread adoption of machine learning on immune data has been inhibited by a lack of reproducibility, transparency, and interoperability of proposed approaches.
“immuneML is available as a command-line tool and through an intuitive Galaxy web interface, and extensive documentation of workflows is provided. We demonstrate the broad applicability of immuneML by (i) reproducing a large-scale study on immune state prediction, (ii) developing, integrating, and applying a novel method for antigen specificity prediction, and (iii) showcasing streamlined interpretability-focused benchmarking of machine learning analyses on immune data,” say Milena Pavlovic and Lonneke Scheffer, PhD students at the Centre for Bioinformatics and the lead authors of this study.