Genetic distances - methods for comparing whole genome sequences
Infectious diseases are re-emerging as a global hazard, as international travel enables infections to spread much further and faster than before. In addition, the bacteria causing these infections are increasingly resistant to the antibiotics used to combat them. Examples of such bacteria are methicillin-resistant Staphylococcus aureus (MRSA), which can cause pneumonia and sepsis; E. coli, which can cause urinary tract infections and gastroenteritis; and Campylobacter jejuni, which also causes gastroenteritis.
One way of combating these bacteria is to trace them to their source in order to curb their spread. This is complicated by the fact that many of these bacteria do not have humans as their only natural habitat. C. jejuni can be found in several animals, such as chickens and pigs, as well as in water, and E. coli can be found in the gut of many animals and in water. This means that an infection can have many possible sources. The source of an infection is usually traced by determining which strain the bacterium belongs to and comparing it to isolates from other known sources of infection: humans, other animals and the environment alike. With the advent of DNA sequencing we now have many new methods for determining the strain of a bacterium and many new ways of comparing genomes to each other. However, no commonly accepted “best way” of comparing them has been established.
This project aims to compare a set of methods for calculating genetic distances between genomes. Among these will be different versions of a core gene set, including Multilocus Sequence Typing (MLST), as well as several ways of defining and calculating distances based on small insertions and deletions and on Single Nucleotide Polymorphisms (SNPs). Several data sets will be used to see whether different measures are more appropriate for different kinds of bacteria.
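To give a feel for what such distance measures look like in practice, here is a minimal Python sketch of two of the simplest ones: a pairwise SNP count on already aligned sequences, and an allele-difference count on MLST-style profiles. It is only an illustration under stated assumptions, not the method the project will use. The function names, the isolate names, the toy sequences and the allele numbers are all invented for the example, and skipping gaps and N bases is just one of several possible conventions.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b, ignore="-N"):
    """Count aligned positions where two sequences differ.

    Positions where either sequence has a gap or an ambiguous base
    (characters in `ignore`) are skipped; this is an assumed convention.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a not in ignore and b not in ignore
    )


def allele_distance(profile_a, profile_b):
    """Count loci, among those present in both profiles, with different alleles."""
    shared = profile_a.keys() & profile_b.keys()
    return sum(1 for locus in shared if profile_a[locus] != profile_b[locus])


def distance_matrix(items, distance):
    """Build a symmetric nested-dict distance matrix over all pairs of items."""
    names = sorted(items)
    matrix = {name: {name: 0} for name in names}
    for x, y in combinations(names, 2):
        d = distance(items[x], items[y])
        matrix[x][y] = d
        matrix[y][x] = d
    return matrix


# Toy data, invented for illustration only.
alignment = {
    "iso1": "ACGTACGT",
    "iso2": "ACGTACGA",
    "iso3": "ACCTAC-A",
}
profiles = {
    "iso1": {"aspA": 2, "glnA": 1, "gltA": 5},
    "iso2": {"aspA": 2, "glnA": 1, "gltA": 7},
    "iso3": {"aspA": 4, "glnA": 1, "gltA": 7},
}

snp_matrix = distance_matrix(alignment, snp_distance)
mlst_matrix = distance_matrix(profiles, allele_distance)
```

Even on this toy data the two measures work at different resolutions: the SNP count looks at every aligned position, while the allele count treats a locus as either identical or different regardless of how many bases changed. Differences of this kind are what the comparison in this project is meant to examine on real data.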
Students applying for this project should know, or be willing to learn, the following: working in a Unix environment, basic programming in Python, bioinformatics methods such as multiple sequence alignment and phylogenetic methods, and the basics of bacteriology and genomics.
This project is part of the StrainTracer project, which aims to build a platform that will allow people without a bioinformatics background to use genomic data to help trace infections.
Main supervisor: Karin Lagesen - firstname.lastname@example.org