Enterohaemorrhagic Escherichia coli (EHEC) O157:H7

We have completed the genome sequence of the Escherichia coli O157:H7 strain EDL933, as described in the January 25, 2001 issue of Nature. The sequence has been processed by NCBI and entered into GenBank as 495 "pieces" (accession numbers AE005177 through AE005671), accessible via Entrez and BLAST. The complete genome is also available via the NCBI FTP site. In addition, we have placed the annotated genomic sequence on our web site (see below).
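As a quick check on the accession range quoted above, the sketch below expands an inclusive range into individual accession strings and counts them. It assumes every piece shares the two-letter "AE" prefix followed by a zero-padded number; the helper name `accession_range` is hypothetical and is not part of any NCBI tool.

```python
def accession_range(first: str, last: str) -> list[str]:
    """Expand an inclusive accession range that shares a letter prefix."""
    prefix_len = 2  # assumption: two-letter prefix, e.g. "AE"
    prefix = first[:prefix_len]
    if last[:prefix_len] != prefix or len(first) != len(last):
        raise ValueError("accessions must share a prefix and a length")
    width = len(first) - prefix_len  # digits are zero-padded to this width
    start, stop = int(first[prefix_len:]), int(last[prefix_len:])
    return [f"{prefix}{n:0{width}d}" for n in range(start, stop + 1)]


pieces = accession_range("AE005177", "AE005671")
print(len(pieces), pieces[0], pieces[-1])  # prints: 495 AE005177 AE005671
```

The count of 495 agrees with the number of GenBank pieces stated above.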

Please note that a typographical error in Nature resulted in the wrong accession number for the complete sequence: it should be AE005174, not AE00517H.

Feb 13, 2001 -- Gap filled. As noted in our paper, the initial version of the sequence had two gaps and a number of sequence ambiguities, which we continue to work on. The larger gap has now been filled, and an updated (version 2.0) sequence has been sent to NCBI. The sequences on our web server have also been updated.

The gap had been estimated at 54 kbp from the optical map data, and the new sequence spans 53,598 bp. It contains yet another cryptic lambdoid prophage, designated CP-933P, and a copy of the insertion sequence ISEc8. Most of the sequence is homologous to other parts of the genome, which contributed to gaps in the whole-genome shotgun assembly. The sequence was resolved by generating a separate shotgun library from a DNA fragment spanning the gap. The remaining gap, between contig 1 and contig 2, is about 4 kbp; it also lies within a cryptic prophage region and is being addressed in a similar fashion.

Please continue to watch this page for further updates; we will post them here as they become available, while awaiting processing by NCBI to generate updated GenBank entries.


Nicole T. Perna, Guy Plunkett III, Valerie Burland, Bob Mau, Jeremy D. Glasner, Debra J. Rose, George F. Mayhew, Peter S. Evans, Jason Gregor, Heather A. Kirkpatrick, György Pósfai, Jeremiah Hackett, Sara Klink, Adam Boutin, Ying Shao, Leslie Miller, Erik J. Grotbeck, N. Wayne Davis, Alex Lim, Eileen Dimalanta, Konstantinos D. Potamousis, Jennifer Apodaca, Thomas S. Anantharaman, Jieyi Lin, Galex Yen, David C. Schwartz, Rodney A. Welch, and Frederick R. Blattner (2001) Genome sequence of enterohaemorrhagic Escherichia coli O157:H7. Nature 409 (6819), 529-533.

Strain availability

The specific O157:H7 strain we sequenced has been deposited with the American Type Culture Collection (ATCC) and is available as ATCC 700927, both as a culture and as purified DNA.

Download Sequences from the UWGP web site

For ease of comparison, we have linearized the genome at the same site we chose for the E. coli K-12 strain MG1655 sequence. As noted above, there is a gap of about 4 kbp between contig 1 and contig 2; the end of contig 2 and the beginning of contig 1 overlap by 527 bp, which closes the circular chromosome.
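The overlap described above can be checked programmatically once the contig sequences are loaded as strings. The following is a minimal sketch, not the procedure used by the project: it assumes both contigs are on the same strand and that the overlap is an exact match, and the function name `join_at_overlap` and the toy sequences are hypothetical.

```python
def join_at_overlap(tail_contig: str, head_contig: str, overlap: int) -> str:
    """Join two sequences whose ends share `overlap` identical bases.

    The last `overlap` bases of `tail_contig` must equal the first
    `overlap` bases of `head_contig`; the shared bases are kept once.
    """
    if overlap <= 0:
        raise ValueError("overlap must be a positive number of bases")
    if tail_contig[-overlap:] != head_contig[:overlap]:
        raise ValueError("ends do not match over the stated overlap")
    return tail_contig + head_contig[overlap:]


# Toy sequences standing in for the end of contig 2 and the start of contig 1.
contig2_end = "GGATCCTTACG"
contig1_start = "TTACGAATTC"
joined = join_at_overlap(contig2_end, contig1_start, overlap=5)
print(joined)  # prints: GGATCCTTACGAATTC
```

With the real sequences, the same call would use `overlap=527`; an exact string comparison will fail if the two copies of the overlap differ at even one base, so a tolerant alignment may be more appropriate where ambiguities remain.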

Download Table of EDL933 genes from UWGP web site

A table of all the annotated genes (ORFs and RNAs) in the current EDL933 sequence has been prepared. In addition to information derived from the gene annotations (e.g., location, name, product, and function), a column labeled SegmentType indicates whether the gene is within a backbone region, within an O-island, or at a junction between backbone and island. The table is a tab-delimited text file in a Zip archive: EDL933 table
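Once unzipped, a tab-delimited table of this kind can be summarized with a few lines of Python. The sketch below counts genes per SegmentType value. Only the SegmentType column name comes from the description above; the `Name` column, the sample rows, and the value labels (`backbone`, `island`, `junction`) are assumptions made for illustration and may not match the actual file.

```python
import csv
import io
from collections import Counter


def count_segment_types(handle) -> Counter:
    """Count rows per SegmentType in a tab-delimited table with a header row."""
    reader = csv.DictReader(handle, delimiter="\t")
    return Counter(row["SegmentType"] for row in reader)


# Hypothetical sample rows; the real table has more columns and other values.
sample = io.StringIO(
    "Name\tSegmentType\n"
    "geneA\tbackbone\n"
    "geneB\tisland\n"
    "geneC\tbackbone\n"
    "geneD\tjunction\n"
)
counts = count_segment_types(sample)
for segment_type, n in counts.most_common():
    print(segment_type, n)
```

To run this against the downloaded table, pass an open file handle in place of `sample`, for example `open(path, newline="")`, where `path` points to the extracted text file.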


Funding for this work included grants from the NIH (NIAID and NCHGR) and the University of Wisconsin Graduate School.

Press release from the University of Wisconsin-Madison


