MLST/MLEE/AFLP databases

LaMDa maintains and curates several strain typing databases for Bacillus cereus group bacteria on our own web server, which is run by the Norwegian EMBnet node.

MultiLocus Sequence Typing (MLST) is a widely used tool for phylogenetic typing of bacteria. MLST is based on PCR amplification and sequencing of internal fragments of a number (usually 6 or 7) of essential or housekeeping genes spread around the bacterial chromosome. The genetic relatedness among isolates is then determined by comparison of the nucleotide sequence types. Because it relies on sequence data, MLST is unambiguous and truly portable among laboratories. Since the initial development of this technique for Neisseria meningitidis in 1998 (Maiden et al. 1998, Proc. Natl. Acad. Sci. 95:3140-45), MLST schemes have been developed for the most important bacterial pathogens, including Streptococcus pneumoniæ, Streptococcus pyogenes, Hæmophilus influenzæ, Staphylococcus aureus, Campylobacter jejuni, Enterococcus fæcium, Escherichia coli, and Salmonella enterica, and schemes are being developed for many other species (see Maiden 2006, Annu. Rev. Microbiol. 60:561-588 for a recent review). These MLST schemes have been used successfully to explore the population structure of bacteria, to study the evolution of their virulence properties, and to identify antibiotic-resistant strains and epidemic clones.
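The typing step described above can be illustrated with a minimal sketch. It assumes that each distinct fragment sequence at a locus receives an integer allele number and that each distinct combination of allele numbers defines a sequence type (ST). The locus names, isolate names, and sequences are hypothetical toy data, and numbers are assigned here in order of first appearance, whereas a curated MLST database assigns them centrally.

```python
from typing import Dict, List, Tuple

# Hypothetical locus names; a real scheme uses 6 or 7 housekeeping genes.
LOCI: List[str] = ["geneA", "geneB", "geneC"]

def assign_alleles(fragments_by_isolate: Dict[str, Dict[str, str]],
                   loci: List[str]) -> Dict[str, Tuple[int, ...]]:
    """Map each isolate to its allelic profile (one allele number per locus)."""
    allele_ids: Dict[str, Dict[str, int]] = {locus: {} for locus in loci}
    profiles: Dict[str, Tuple[int, ...]] = {}
    for isolate, fragments in fragments_by_isolate.items():
        profile = []
        for locus in loci:
            seq = fragments[locus].upper()
            known = allele_ids[locus]
            if seq not in known:
                # First time this sequence is seen at this locus: new allele.
                known[seq] = len(known) + 1
            profile.append(known[seq])
        profiles[isolate] = tuple(profile)
    return profiles

def assign_sequence_types(profiles: Dict[str, Tuple[int, ...]]) -> Dict[str, int]:
    """Give every distinct allelic profile its own sequence type number."""
    st_ids: Dict[Tuple[int, ...], int] = {}
    result: Dict[str, int] = {}
    for isolate, profile in profiles.items():
        if profile not in st_ids:
            st_ids[profile] = len(st_ids) + 1
        result[isolate] = st_ids[profile]
    return result

def allelic_distance(p: Tuple[int, ...], q: Tuple[int, ...]) -> int:
    """Number of loci at which two allelic profiles differ."""
    return sum(a != b for a, b in zip(p, q))

# Toy fragments (far shorter than real MLST fragments).
toy_fragments = {
    "iso1": {"geneA": "ACGT", "geneB": "TTGA", "geneC": "GGCC"},
    "iso2": {"geneA": "ACGT", "geneB": "TTGA", "geneC": "GGCC"},
    "iso3": {"geneA": "ACGA", "geneB": "TTGA", "geneC": "GGCC"},
}

profiles = assign_alleles(toy_fragments, LOCI)
sequence_types = assign_sequence_types(profiles)
print(profiles)
print(sequence_types)
print(allelic_distance(profiles["iso1"], profiles["iso3"]))
```

In this toy data, iso1 and iso2 share all three fragments and therefore the same ST, while iso3 differs from them at a single locus. Because isolates are compared through exact sequences and the numbers derived from them, results from different laboratories can be compared directly once they refer to the same allele and ST numbering.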

Our group has designed and applied the first MLST scheme for phylogenetic analysis of B. cereus group bacteria (Helgason et al. 2004, Appl. Environ. Microbiol. 70:191-201), leading to the identification of several clonal lineages comprising strains isolated from clinical sources. Alternative schemes have subsequently been developed by several other groups (Ko et al. 2004, Infect. Immun. 72:5253-61; Priest et al. 2004, J. Bacteriol. 186:7959-70; Candelon et al. 2004, Microbiology 150:601-611; Sorokin et al. 2006, Appl. Environ. Microbiol. 72:1569-1578). Since 2004, MLST has been extensively used as the main typing method for analyzing the genetic relationships within the whole B. cereus group population, with more than 20 peer-reviewed publications to date.

However, all B. cereus group MLST schemes are based on different gene and isolate sets, which makes results difficult to compare. Therefore, in collaboration with Dr. Alexei Sorokin (INRA, France), we have designed a combined and optimized scheme based on three genes from the Helgason et al. 2004 scheme, three genes from the Priest et al. 2004 scheme, and one gene from the Sorokin et al. 2006 scheme. This new scheme is described in Tourasse, Helgason et al. 2006, J. Appl. Microbiol. 101:579-593. To date, 230 B. cereus group isolates have been analyzed using the combined MLST scheme. The data are available in the TH database, which we developed and which is part of our typing databases website.

Furthermore, in order to provide the B. cereus group research community with a common MLST resource and means for building a comprehensive genetic analysis of the group, we have now developed a new integrated multi-scheme MLST database, SuperCAT, that compiles all MLST data from the five published schemes for the B. cereus group (Tourasse and Kolstø 2008, Nucleic Acids Res. 36[Database issue]:D461-D468). We used supertree techniques to combine the phylogenetic information from analysis of all MLST schemes and datasets, in order to produce an integrated view of the B. cereus group population. In particular, the strains with complete genome sequences (currently 85), for which all MLST loci are thus available, can be used to join the schemes by supertree analysis. The current dataset contains a total of 1430 isolates that have been typed by some or all of 26 gene fragments from 25 different genes.
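To make the idea of joining schemes through shared strains more concrete, the sketch below encodes two small source trees with matrix representation with parsimony (MRP), one widely used supertree technique. This is an illustrative assumption: the text above does not state which supertree method was applied to SuperCAT. The trees, strain names, and helper functions are hypothetical. Strains g1 and g2 stand in for genome-sequenced strains typed under both schemes, while a1, a2, and b1 were each typed under one scheme only.

```python
from typing import Dict, FrozenSet, Iterator, List, Set, Tuple, Union

# A tree is either a leaf name or a tuple of subtrees (nested tuples).
Tree = Union[str, Tuple["Tree", ...]]

def leaves(tree: Tree) -> Set[str]:
    """Return the set of leaf names in a tree."""
    if isinstance(tree, str):
        return {tree}
    out: Set[str] = set()
    for child in tree:
        out |= leaves(child)
    return out

def clades(tree: Tree) -> Iterator[FrozenSet[str]]:
    """Yield the leaf set of every internal node below the root."""
    if isinstance(tree, str):
        return
    for child in tree:
        if not isinstance(child, str):
            yield frozenset(leaves(child))
            yield from clades(child)

def mrp_matrix(source_trees: List[Tree]) -> Dict[str, str]:
    """Encode source trees as one binary character per clade.

    '1' = strain is inside the clade, '0' = strain is in that source tree
    but outside the clade, '?' = strain is absent from that source tree.
    """
    all_taxa = sorted(set().union(*(leaves(t) for t in source_trees)))
    rows: Dict[str, List[str]] = {taxon: [] for taxon in all_taxa}
    for tree in source_trees:
        present = leaves(tree)
        for clade in clades(tree):
            for taxon in all_taxa:
                if taxon not in present:
                    rows[taxon].append("?")
                elif taxon in clade:
                    rows[taxon].append("1")
                else:
                    rows[taxon].append("0")
    return {taxon: "".join(chars) for taxon, chars in rows.items()}

# Hypothetical trees from two typing schemes that share strains g1 and g2.
tree_scheme_a: Tree = (("g1", "a1"), ("g2", "a2"))
tree_scheme_b: Tree = (("g1", "b1"), "g2")

matrix = mrp_matrix([tree_scheme_a, tree_scheme_b])
for taxon, row in matrix.items():
    print(taxon, row)
```

Only g1 and g2 have no missing entries, which is why strains typed under every scheme are the ones that tie the source trees together. In an MRP analysis the resulting matrix would then be passed to a parsimony program to infer the supertree; that step is not shown here.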

We have recently extended the supertree approach to integrate MLST data with phylogenetic information from MultiLocus Enzyme Electrophoresis (MLEE) and Amplified Fragment Length Polymorphism (AFLP), two other typing methods that have been used for large-scale population studies of the B. cereus group. The combined MLST, MLEE, and AFLP data have been incorporated into the new multi-datatype HyperCAT database containing data for 2262 isolates (Tourasse et al. 2010, DATABASE baq017; Tourasse et al. 2010, Food Microbiol. in press). The new HyperCAT database, the SuperCAT database, and the database specific to the optimized Tourasse, Helgason et al. 2006 scheme (TH Database) are all available on the web server set up locally at the University of Oslo, thanks to the facilities and support provided by the Norwegian EMBnet node and the HPC Supercomputing facilities at the University of Oslo.

Published Feb. 28, 2011 12:14 AM - Last modified Apr. 8, 2022 7:43 AM