Large-scale, multi-genome analysis of alternate open reading frames in bacteria and archaea

Felipe Veloso, Gonzalo Riadi, Daniela Aliaga, Ryan Lieph, David S. Holmes

Research output: Contribution to journal › Review article › peer-review

15 Citations (Scopus)

Abstract

Analysis of over 300,000 annotated genes in 105 bacterial and archaeal genomes reveals an unexpectedly high frequency of large (>300 nucleotides) alternate open reading frames (ORFs). Especially notable is the very high frequency of alternate ORFs in frames +3 and -1 (where the annotated gene is defined as frame +1). The occurrence of alternate ORFs is correlated with genomic G+C content and is strongly influenced by synonymous codon usage bias. The frequency of alternate ORFs in frame -1 is also influenced by the occurrence of codons encoding leucine and serine in frame +1. Although some alternate ORFs have been shown to encode proteins, many others are probably not expressed because they lack appropriate signals for transcription and translation. The latter can be mis-annotated by automatic gene-finding programs, leading to errors in public databases. Frame -1 is especially prone to mis-annotation because it exhibits a potential codon usage and a theoretical capacity to encode proteins with an amino acid composition most similar to that of real genes. Some alternate ORFs are conserved across bacterial or archaeal species and can give rise to mis-annotated "conserved hypothetical" genes, while others are unique to a genome and are misidentified as "hypothetical orphan" genes, contributing significantly to the orphan gene paradox.
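
The frame terminology in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline, and it rests on stated assumptions: frames +2 and +3 are taken as the annotated strand shifted by one and two nucleotides, frame -1 as the reverse complement read codon-for-codon against frame +1, and an "ORF" as a run of codons uninterrupted by a stop. The function names (`reverse_complement`, `longest_stop_free_run`, `alternate_frame_runs`) are hypothetical. Under that reading of frame -1, the link to leucine and serine follows from base pairing: the reverse complements of the stop codons TAA, TAG and TGA are TTA (Leu), CTA (Leu) and TCA (Ser), so each such codon in frame +1 places a stop in frame -1.

```python
# Illustrative sketch only; frame numbering and the stop-to-stop ORF
# definition are assumptions, not taken from the paper's methods.
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_stop_free_run(seq, offset):
    """Longest run of consecutive non-stop codons (in nucleotides),
    reading seq in steps of three starting at offset."""
    best = current = 0
    for i in range(offset, len(seq) - 2, 3):
        if seq[i:i + 3] in STOPS:
            current = 0
        else:
            current += 3
            best = max(best, current)
    return best


def alternate_frame_runs(gene):
    """Longest stop-free run in each of the six frames of an annotated gene.

    The gene is assumed to be given 5'->3' in frame +1 with a length that
    is a multiple of three, so that frame -1 is codon-aligned with +1.
    """
    gene = gene.upper()
    rc = reverse_complement(gene)
    runs = {}
    for k in range(3):
        runs[f"+{k + 1}"] = longest_stop_free_run(gene, k)
        runs[f"-{k + 1}"] = longest_stop_free_run(rc, k)
    return runs


# Frame +1 codons that create a stop codon in the codon-aligned antisense frame.
antisense_of_stops = {stop: reverse_complement(stop) for stop in sorted(STOPS)}
```

In a survey like the one described, a frame would count as carrying a large alternate ORF when its run exceeds the 300-nucleotide threshold quoted in the abstract, for example `alternate_frame_runs(gene)["-1"] > 300`.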

Original language: English
Pages (from-to): 91-105
Number of pages: 15
Journal: OMICS A Journal of Integrative Biology
Volume: 9
Issue number: 1
Publication status: Published - 2005

ASJC Scopus subject areas

  • Biotechnology
  • Biochemistry
  • Molecular Medicine
  • Molecular Biology
  • Genetics
