Updating Escherichia coli K-12

    Both the sequence and annotations for Escherichia coli K-12 strain MG1655 have been updated and deposited in GenBank (accession no. U00096.2). A copy of the GenBank flatfile is available for download from our server (U00096.2.gbk), as are a FASTA file of the updated sequence (U00096.2.fas) and an Excel spreadsheet that summarizes the MG1655 update in terms of nucleotide sequence corrections and the consequent protein sequence changes.
    In addition, this updated version (designated version m56) has been made public via the ASAP database. Because this is a public release, no registration or login is required; simply go to ASAP and click [Enter ASAP].

The following letter appeared in the January 2004 issue of ASM News (reproduced with permission). Please watch this space for further developments on this project.

Workshop on Annotation of Escherichia coli K-12

    The readers of ASM News may be interested in knowing of a community effort recently initiated for the community good. The Workshop on Annotation of Escherichia coli K-12 2003 was held at Woods Hole, Mass., on 14-18 November. Fifteen scientists from Japan, Europe, and the United States came together at their own expense to participate in coordinating their work related to the E. coli genome as a whole: Martha Arnaud, SRI International, Menlo Park, Calif.; Mary Berlyn, Yale University, New Haven, Conn.; Fred Blattner, University of Wisconsin, Madison; Michael Galperin, NCBI, NIH, Bethesda, Md.; Jeremy Glasner, University of Wisconsin, Madison; Takashi Horiuchi, National Institute of Basic Biology, Japan; Takehide Kosuge, DNA DataBank of Japan; Hirotada Mori, Nara Institute of Science & Technology, Japan; Nicole Perna, University of Wisconsin, Madison; Guy Plunkett III, University of Wisconsin, Madison; Monica Riley, Marine Biological Laboratory, Woods Hole, Mass.; Kenneth E. Rudd, University of Miami Medical School, Miami, Fla.; Gretta Serres, Marine Biological Laboratory, Woods Hole, Mass.; Gavin Thomas, York University, York, United Kingdom; and Barry Wanner, Purdue University, Lafayette, Ind. These were scientists who have for many years been assembling knowledge on the DNA sequences, the gene boundaries, or the identity of the gene products of the genome of E. coli K-12. The object of coordinating the separate but parallel data collections, much of it as yet unpublished, was to benefit not only the community of scientists focused on E. coli itself, but beyond that to benefit the genomics community as a whole. Much of the attempt to understand the biology of genomic sequences of all organisms depends on a strong foundation of the best information from model organisms. E. coli has served for decades as a model organism for microbiology, including studies on cell structure, metabolic biochemistry and microbial genetics.
    Although many participants were interested in all phases of annotation of E. coli genes, in the interests of efficiency we created an artificial separation of working groups into two areas. One group addressed identifying known and predicted gene products; the other group worked on reconciling DNA sequences and establishing gene and pseudogene borders. The two groups were self-selecting: Arnaud, Galperin, Glasner, Perna, Riley, Serres and Thomas annotated gene products, while Horiuchi, Kosuge, Mori, Plunkett and Rudd focused on gene boundaries and sequences, with Rudd and Berlyn responsible for gene names and synonyms. Blattner and Wanner were observers who participated in discussions and global planning.
    Participants worked intensively for four and a half days. Their efforts produced pooled information that was agreed upon to be reliable for about 25% of E. coli genes, their sequences, gene borders, and the identification of gene products. They agreed to continue to work in coming weeks in small groups with close electronic communication, periodically sharing their results with all workshop participants for inspection and correction. The intensively reviewed and coordinated data will be submitted to GenBank in early 2004, and made available on the Internet for public access.

Monica Riley
Marine Biological Laboratory
Woods Hole, Mass.





© 2002-2017 UW E. coli Genome Project