E. coli K-12 Paralog Table

This table summarizes the results of a pairwise comparison of all predicted open reading frames (ORFs) in the E. coli K-12 genome. The sequence of each ORF was compared against a database of all 4290 ORFs using a Decypher (Time Logic, Inc.) search engine that performs Smith-Waterman local sequence alignments. For each ORF, the ten highest-scoring matches (those with an aligned region of at least 200 bp and at least 50% identity) are reported in this table. To obtain a non-redundant list of only the best match for each ORF, sort on the Rank column and retain the matches with rank=1. A total of 586 ORFs have at least one match of at least 200 bp with at least 50% identity. The table contains the following information:

  1. b-number: corresponds to the non-redundant identifier assigned to each ORF by the Blattner Lab
  2. Gene: common name of an ORF; if unnamed this field contains the b-number
  3. Rank: the rank of this match among the paralogs of the ORF (rank 1 is the best match)
  4. Paralog b-number: the b-number of the paralogous sequence
  5. Paralog gene: the common name for the paralogous ORF
  6. Align length: the length of the aligned region between the query and target ORFs
  7. Pct Identity: the percent identity within the aligned region
  8. Gaps: the number of gaps in the alignment used to compute the percent identity
  9. No. of paralogs: the number of paralogous ORFs in the K-12 genome (maximum of 9)
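
The rank=1 filter described above can be sketched in Python as follows. This is only a minimal sketch under stated assumptions: the table's file format is not specified here, so the code assumes a tab-delimited export with a header row whose column names match the field names listed above. The function names and the sample rows (b9001, geneA, and so on) are invented for illustration and are not real table entries.

```python
import csv
import io

# Assumed column names, taken from the field list above. The real file's
# header and delimiter may differ; adjust these if so.
COLUMNS = ["b-number", "Gene", "Rank", "Paralog b-number", "Paralog gene",
           "Align length", "Pct Identity", "Gaps", "No. of paralogs"]


def read_table(handle, delimiter="\t"):
    """Read the paralog table into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle, delimiter=delimiter))


def best_matches(rows):
    """Keep only the rank-1 match for each query ORF (one row per b-number)."""
    best = {}
    for row in rows:
        if int(row["Rank"]) == 1:
            # setdefault keeps the first rank-1 row seen for a given ORF
            best.setdefault(row["b-number"], row)
    return list(best.values())


# Invented sample rows, for illustration only (not real data).
SAMPLE = "\n".join("\t".join(fields) for fields in [
    COLUMNS,
    ["b9001", "geneA", "1", "b9002", "geneB", "450", "72.0", "3", "2"],
    ["b9001", "geneA", "2", "b9003", "b9003", "300", "55.0", "5", "2"],
    ["b9002", "geneB", "1", "b9001", "geneA", "450", "72.0", "3", "1"],
]) + "\n"

if __name__ == "__main__":
    for row in best_matches(read_table(io.StringIO(SAMPLE))):
        print(row["b-number"], row["Gene"], "->",
              row["Paralog b-number"], row["Paralog gene"])
```

To use this on the actual table, replace io.StringIO(SAMPLE) with an open file handle, for example open(path, newline=""), where path is wherever the table has been saved.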

Jeremy Glasner 9/27/99

