Uropathogenic Escherichia coli (UPEC)

Uropathogenic strains of Escherichia coli (UPEC) are the most common cause of community-acquired urinary tract infections, responsible for 70-90% of the 7 million cases of acute cystitis and 250,000 cases of pyelonephritis reported annually in the United States. We have completed the genome sequence of the highly virulent UPEC strain CFT073, which was isolated from the blood of a woman with acute pyelonephritis.

Whole-genome shotgun libraries were constructed in M13Janus and pBluescript and sequenced by the method of Sanger on ABI 377 and 3700 instruments, collecting over 88,800 reads. The completed genome (5,231,428 bp) has been annotated and deposited in the public databases (accession number AE014075). Unlike many E. coli strains, CFT073 harbors no plasmids.


  • R. A. Welch, V. Burland, G. Plunkett III, P. Redford, P. Roesch, D. Rasko, E. L. Buckles, S.-R. Liou, A. Boutin, J. Hackett, D. Stroud, G. F. Mayhew, D. J. Rose, S. Zhou, D. C. Schwartz, N. T. Perna, H. L. T. Mobley, M. S. Donnenberg, and F. R. Blattner (2002) Extensive mosaic structure revealed by the complete genome sequence of uropathogenic Escherichia coli. Proc. Natl. Acad. Sci. USA 99(26):17020-17024.

Strain availability

The UPEC strain CFT073 [WAM2267] has been deposited with the American Type Culture Collection (ATCC) and is available as ATCC 700928, either as a culture or as purified genomic DNA.

Sequence availability

We have placed the annotated genome sequence on our web server (see below). The sequence has been processed by NCBI and entered into GenBank as 18 "pieces" (accession numbers AE016755 - AE016772), which are accessible via Entrez and BLAST. The complete genome is also available as a single entry from the NCBI FTP site.
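For scripted retrieval it can help to spell out the individual GenBank accessions in the range quoted above. The following is a minimal sketch; the function name expand_accessions is hypothetical, and it assumes the pieces are numbered consecutively with a shared letter prefix, as the stated range suggests.

```python
import re
from typing import List


def expand_accessions(first: str, last: str) -> List[str]:
    """List every accession from `first` to `last`, inclusive.

    Assumes both accessions are a letter prefix followed by digits
    and that the range is numbered consecutively.
    """
    pattern = re.compile(r"([A-Z]+)(\d+)")
    m_first, m_last = pattern.fullmatch(first), pattern.fullmatch(last)
    if not m_first or not m_last or m_first.group(1) != m_last.group(1):
        raise ValueError("accessions must share the same letter prefix")
    prefix = m_first.group(1)
    width = len(m_first.group(2))  # keep leading zeros, e.g. 016755
    start, stop = int(m_first.group(2)), int(m_last.group(2))
    return [f"{prefix}{n:0{width}d}" for n in range(start, stop + 1)]


pieces = expand_accessions("AE016755", "AE016772")
print(len(pieces), pieces[0], pieces[-1])
```

The resulting list has one entry per GenBank piece and can be passed to whichever retrieval tool is preferred.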

  • Annotated sequence in GenBank flatfile format: AE014075.gbk
  • Sequence in fasta format: AE014075.fas
  • Sequences of annotated genes (CDS, rRNA, tRNA, misc_RNA) in fasta format: AE014075.fna
  • Sequences of individual proteins in fasta format: AE014075.faa
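The fasta files listed above can be read with a few lines of standard-library Python. This is a minimal sketch rather than a full parser: the helper names read_fasta and total_length are hypothetical, the demo records are made up, and it assumes plain-text FASTA with ">" header lines.

```python
import io
from typing import Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def total_length(handle: TextIO) -> int:
    """Sum the sequence lengths of all records in the stream."""
    return sum(len(seq) for _, seq in read_fasta(handle))


# Tiny in-memory example with made-up records.
demo = io.StringIO(">gene1 hypothetical\nATGC\nGG\n>gene2\nTTA\n")
records = list(read_fasta(demo))
print(records)
```

To apply it to a downloaded file, open the file and pass the handle, for example total_length(open("AE014075.fas")). If that file holds the chromosome as a single record, the result would be expected to agree with the genome size reported above.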

Table of E. coli CFT073 features

A table of the annotated genes and pseudogenes in the current E. coli CFT073 sequence has been prepared. The table is a tab-delimited text file in a Zip archive: AE014075 table
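The tab-delimited table can be read directly from the Zip archive without unpacking it first. The sketch below makes several assumptions: the function name read_tab_table is hypothetical, the archive is taken to contain a single text file, the text is taken to be UTF-8, and the demo columns are invented because the real column layout is not described here.

```python
import csv
import io
import zipfile
from typing import BinaryIO, List, Union


def read_tab_table(zip_source: Union[str, BinaryIO]) -> List[List[str]]:
    """Return the rows of the first file in a Zip archive, split on tabs."""
    with zipfile.ZipFile(zip_source) as archive:
        member = archive.namelist()[0]  # assumption: one text file per archive
        with archive.open(member) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            # QUOTE_NONE keeps any quote characters in gene descriptions literal.
            return list(csv.reader(text, delimiter="\t", quoting=csv.QUOTE_NONE))


# Build a small in-memory archive with invented columns to show the call.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("demo.txt", "name\tstart\tend\ngeneA\t1\t300\n")
buffer.seek(0)

rows = read_tab_table(buffer)
print(rows)
```

For the real download, pass the path of the saved archive instead of the in-memory buffer, and check whether the first row is a header before treating it as data.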


This project is part of our Bacterial Pathogens Genome Initiative, funded by NIAID.



© 2002-2024 UW E. coli Genome Project