Algorithms for rediscovery of cancer mutations in DNA sequencing data from blood plasma
Next-generation sequencing of DNA and RNA is commonly performed in cancer research. The laboratory tools used for this purpose are fairly new, and they generate terabytes of data from individual biological samples. The computational analysis tools are still immature, and there is an unmet need for good strategies for integrating data and results from DNA and RNA sequencing.
Objective and challenges
This master project will start out by analysing sequencing data from free-floating (cell-free) DNA, cfDNA, isolated from the blood plasma of patients who are at risk of relapse from cancer. Several features of the sequencing data are of interest, including individual base changes and relative DNA copy numbers (variation in sequencing coverage) along segments of the genome. One challenge here is the low signal-to-noise ratio: cancer DNA is hidden within a large amount of DNA from healthy cells. Researchers in the group are already experienced in the analysis of such data, but mainly data originating from tissue samples. Results from these tissue analyses will be used as positive controls in the development of useful algorithms for the analysis of plasma-derived cfDNA sequencing data. The aim of the master project is to develop relevant algorithms (possibly stand-alone software) for such DNA sequencing research.
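To make the two signals above concrete, here is a minimal Python sketch. It is not the group's method, only an illustration under stated assumptions. It assumes that read counts at a mutation already known from the tumour tissue are available, and that sequencing errors at that position follow a fixed background rate (the value 0.001 is an arbitrary placeholder). The function names binomial_tail and log2_coverage_ratio and all numbers in the usage lines are hypothetical.

```python
from math import exp, lgamma, log, log1p, log2


def binomial_tail(alt_reads, depth, error_rate):
    """P(X >= alt_reads) for X ~ Binomial(depth, error_rate).

    A small value suggests that the mutant reads seen in plasma are
    unlikely to be sequencing errors alone. Assumes 0 < error_rate < 1.
    """
    if alt_reads <= 0:
        return 1.0
    total = 0.0
    # Sum the upper tail directly, in log space, to avoid overflow
    # and the cancellation that 1 - P(X < alt_reads) would cause.
    for k in range(alt_reads, depth + 1):
        log_pmf = (
            lgamma(depth + 1) - lgamma(k + 1) - lgamma(depth - k + 1)
            + k * log(error_rate)
            + (depth - k) * log1p(-error_rate)
        )
        total += exp(log_pmf)
    return min(1.0, total)


def log2_coverage_ratio(plasma_coverage, reference_coverage):
    """Relative copy-number signal for one genomic segment.

    Both inputs are assumed to be normalised mean coverages (> 0);
    0 means no change, negative values suggest loss, positive gain.
    """
    return log2(plasma_coverage / reference_coverage)


# Illustrative, made-up numbers: 6 mutant reads out of 2000 at a
# position where the tumour tissue is known to carry a mutation.
p_value = binomial_tail(alt_reads=6, depth=2000, error_rate=0.001)
ratio = log2_coverage_ratio(plasma_coverage=95.0, reference_coverage=100.0)
print(f"p-value: {p_value:.3g}, log2 coverage ratio: {ratio:.3f}")
```

Restricting the test to positions already found in the tissue sample is what makes the tissue results useful as positive controls: only a handful of sites are tested, so far fewer mutant reads are needed than for de novo mutation calling. A real analysis would also have to estimate the error rate per position and correct coverage for biases such as GC content.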
The main supervisor will be Rolf Skotheim, with Bjarne Johannessen, Andreas M. Hoff, and Torbjørn Rognes as co-supervisors. The work will include time at Oslo University Hospital, Radium Hospital. No prior knowledge of biology is required. Programming will be done in Perl/Python/Java and R in a Unix environment, all of which are widely used in bioinformatics. Previous knowledge of at least some of these languages is desirable but not required.
This master project is offered as a long master project (60 study points).