Bacterial Pathogen Genome Sequencing Projects

The genome center at the University of Wisconsin was established to sequence the genome of Escherichia coli K-12 strain MG1655, which has served for decades as a model organism for basic studies of biochemistry, physiology, genetics and biotechnology. When the sequencing of this genome was completed in 1997, we turned to a group of related bacterial pathogens, making use of the E. coli K-12 sequence to accelerate analysis of the new genomes. The pathogenic Enterobacteriaceae we have selected include human diarrheagenic and extraintestinal strains of E. coli, as well as Yersinia pestis, Shigella flexneri, and Salmonella Typhi.

Analysis of the enterohaemorrhagic E. coli O157:H7 EDL933 genome was the first of these to be completed, and serves as a model for our comparative studies. The results of genome sequencing support the widespread involvement of horizontal transfer in the evolution of the Enterobacteriaceae, leading to the presence of distinct islands of DNA in different lineages. Since virulence determinants, as well as "backbone" genes, are shared among pathogens, we expect this multigenomic approach to lead to the characterization of a gene pool of virulence determinants - the "pathosphere."

Availability of sequenced strains

Most of the strains we are sequencing have been deposited with the American Type Culture Collection (ATCC); genomic DNA is also available from these strains:

Strain                             ATCC No.  Genomic DNA
Escherichia coli K-12 MG1655       700926    700926D-5
Escherichia coli EDL933            700927    700927D-5
Escherichia coli CFT073 [WAM2267]  700928    700928D-5
Shigella flexneri 2457T            700930    700930D-5
Salmonella Typhi Ty2               700931    700931D-5

Current Status of Sequencing Projects

Organism                          Disease                                           Strain    Genome size  Last updated   Status
E. coli K-12                      non-pathogen; reference strain                    MG1655    4.6 Mb       Sept 23, 2020
E. coli O157:H7                   haemorrhagic colitis, haemolytic uremic syndrome  EDL933    5.4 Mb       Feb 13, 2001
uropathogenic E. coli             cystitis, pyelonephritis                          CFT073    5.2 Mb       Dec 5, 2002
E. coli K1                        septicemia, neonatal meningitis                   RS218     5.2 Mb       May 17, 2006   finishing (assembly in 3 contigs)
Shigella flexneri 2a              dysentery                                         2457T     4.6 Mb       Apr 28, 2003
Salmonella Typhi                  typhoid fever                                     Ty2       4.8 Mb       Mar 21, 2003
Yersinia pestis                   plague                                            KIM       4.6 Mb       Aug 1, 2002

enterotoxigenic E. coli H10407 (ETEC)
enteropathogenic E. coli E2348/69 (EPEC)
enteroaggregative E. coli 042 (EAEC)
Note: due to lack of funding, we no longer plan to sequence these strains. However, the Sanger Institute has been funded by Beowulf Genomics to perform comparative sequencing of five Escherichia coli and Shigella strains, including EPEC E2348/69 and EAEC 042.

*Plasmids from several Enterobacteriaceae are also being sequenced as separate projects.

All sequence data for the genome sequencing projects was generated by the Genetics Sequencing and Genomics Services Center (GSGSC).


© 2002-2024 UW E. coli Genome Project