TY - GEN
T1 - Predicting the function of hypothetical genes in genomes of bioleaching microorganisms
AU - Ossandón, F. J.
AU - Rivera, G.
AU - Lazo, F.
AU - Holmes, D. S.
PY - 2009
Y1 - 2009
N2 - A particularly challenging problem in genome annotation is to attribute function to genes annotated as "hypothetical, no known function". These typically account for about 40% of all genes regardless of the genome. Some of these are "orphan" genes that are not found in any other genome. Some could encode species-specific proteins and so are particularly interesting for evaluating novel metabolic potential and for understanding the evolution of genes and genomes. Several similarity and non-similarity bioinformatics tools exist that help predict the function of hypotheticals, but none is able to suggest function for more than a few percent, and the annotation of the others remains a formidable task. We have developed a bioinformatics tool called AlterORF (www.AlterORF.cl) that is able to identify alternate open reading frames (ORFs) embedded within annotated genes. Analysis of over 2 million genes in over 700 completely sequenced genomes reveals that alternate ORFs of substantial length (potentially encoding 70 amino acids or more) are surprisingly common, especially in G+C-rich genomes. During our examination of these alternate ORFs, we uncovered hundreds of examples where the alternate ORF has a significant hit in databases of motifs and domains (e.g. CDD, Pfam) and where the actual annotated gene is described as hypothetical and has no database match. This strongly suggests that the annotated gene has been incorrectly identified and that the alternate ORF is the real gene. We describe the evaluation, using AlterORF, of the following genomes of bioleaching microorganisms and others that reside in similar ecological niches: Acidithiobacillus ferrooxidans (2 strains), Leptospirillum type II, Methylacidiphilum infernorum, Picrophilus torridus, Sulfolobus acidocaldarius, S. solfataricus, S. tokodaii, Thermodesulfovibrio yellowstonii, Thermoplasma acidophilum and T. volcanium. Examples of novel genes from these microorganisms and their suggested roles in metabolism will be described.
KW - Orphan genes
KW - Bioleaching microorganisms
UR - http://www.scopus.com/inward/record.url?scp=72449164887&partnerID=8YFLogxK
U2 - 10.4028/www.scientific.net/AMR.71-73.203
DO - 10.4028/www.scientific.net/AMR.71-73.203
M3 - Conference contribution
AN - SCOPUS:72449164887
SN - 0878493220
SN - 9780878493227
T3 - Advanced Materials Research
SP - 203
EP - 206
BT - Biohydrometallurgy 2009
T2 - 18th International Biohydrometallurgy Symposium, IBS 2009
Y2 - 13 September 2009 through 17 September 2009
ER -