A machine learning approach for classification of prostate cancer from RNA data
Next-generation sequencing of DNA and RNA is commonly performed in cancer research. The laboratory tools used for this purpose are fairly new, and they generate terabytes of data from individual biological samples. The computational analysis tools are still immature, and there is an unmet need for good strategies for integrating data and results from DNA and RNA sequencing.
One disease that needs improved management, and that may benefit from such technologies, is prostate cancer. It is the most common cancer in Norway, and yet for a large fraction of patients there is no consensus on how they should be treated. Others have already attempted to add molecular data to existing tools for classifying the aggressiveness of prostate cancers (e.g. The Cancer Genome Atlas, Cell 2015). However, we do not expect these to provide any clinical value, since we now know that samples are highly heterogeneous within the same prostate (publications from our research group: Løvf et al., Eur. Urol., 2018 and Carm et al., unpublished). Thus, we need to find intrinsically stable values for the classification of the patient and his cancer as such.
We have generated a dataset with whole-transcriptome RNA-sequencing data from multiple tissue samples per prostate. These samples are taken from both cancerous and benign-appearing tissues within the prostate. The aim of the study is to implement machine learning approaches to identify patterns of RNA expression that vary between patients but are stable between cancerous and/or normal-appearing tissues within the same prostate. First, the project will evaluate various methods, such as decision trees, random forests, support vector machines and deep neural networks, to determine which is the most relevant to use. Second, and as the main aim, the project will apply the chosen strategy to the actual RNA data from prostate cancer and develop a molecular classification system for prostate cancers. A successful project will help to determine whether a patient has aggressive or indolent disease, and thereby better guide how the patient should be treated and followed up.
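As an illustration only, the idea of expression patterns that vary between patients but are stable within a prostate could be sketched as a per-gene variance decomposition: the share of the total variation in a gene's expression that is explained by which patient a sample came from. This is not the project's chosen method. The function names, gene names, patient IDs and numbers below are hypothetical, and the values are assumed to be normalised (e.g. log-transformed) expression levels.

```python
from statistics import mean


def stability_score(samples_by_patient):
    """Fraction of a gene's total variance explained by patient identity.

    samples_by_patient maps a (hypothetical) patient ID to the expression
    values of one gene in several tissue samples from that patient's prostate.
    A score near 1 means the gene differs between patients but is stable
    within each prostate; a score near 0 means the opposite.
    """
    groups = list(samples_by_patient.values())
    grand_mean = mean(v for values in groups for v in values)
    # Between-patient sum of squares, weighted by the number of samples.
    between = sum(len(values) * (mean(values) - grand_mean) ** 2
                  for values in groups)
    # Within-patient sum of squares (heterogeneity inside each prostate).
    within = sum((v - mean(values)) ** 2 for values in groups for v in values)
    total = between + within
    return between / total if total > 0 else 0.0


def rank_genes(expression):
    """Return gene names sorted from most to least patient-stable."""
    scores = {gene: stability_score(per_patient)
              for gene, per_patient in expression.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Made-up toy data: three patients, three tissue samples each.
example = {
    "GENE_A": {"P1": [10.0, 10.2, 9.8], "P2": [2.0, 2.1, 1.9], "P3": [6.0, 6.1, 5.9]},
    "GENE_B": {"P1": [1.0, 9.0, 5.0], "P2": [5.0, 1.0, 9.0], "P3": [9.0, 5.0, 1.0]},
}
ranking = rank_genes(example)
```

In this toy example, GENE_A is ranked first because nearly all of its variation lies between patients, whereas GENE_B varies only within each prostate. Genes selected by a criterion of this kind could then serve as input features for the classifiers listed above, although the actual feature selection strategy remains to be decided in the project.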
The candidate is expected to spend time at the Oslo University Hospital-Radiumhospitalet during the master project. The main MSc supervisor will be Rolf Skotheim, with co-supervision from Bjarne Johannessen and Marthe Løvf. The project is offered as a long master project (60 study points).