Description
Advances in genomic sequencing and bioinformatic technology have made the genetic information of organisms and microorganisms readily accessible. Despite these advances, the genome sequences of numerous organisms have yet to be annotated. Many of these genome sequences require manual annotation, which can uncover hypothetical protein-coding genes. Through the use of publicly available online bioinformatics tools, such as BLAST, T-COFFEE, TMHMM, SignalP, Phobius, and PSORTb, the functions of hypothetical protein-coding genes can be predicted from their primary amino acid sequences. Two clusters of properties aid in predicting the functions of hypothetical genes: sequence similarity and protein localization. The bioinformatic programs identify properties belonging to these clusters, such as protein families and conserved domains (sequence similarity) and signal peptides and transmembrane regions (protein localization). This research project aims to predict the functions of five unannotated hypothetical protein-coding genes in the genome of the bacterium Coxiella burnetii. The genes BMW92_RS10760, BMW92_RS10830, BMW92_RS10835, BMW92_RS10840, and BMW92_RS10855 were analyzed and predicted to code for the following proteins, respectively: uroporphyrinogen-III synthase, pyrroline-5-carboxylate reductase, pyridoxal phosphate-dependent enzyme, phosphoenolpyruvate carboxykinase, and aspartate carbamoyltransferase. The predicted functions of these hypothetical protein-coding genes provide insight into the proteome of C. burnetii. Ultimately, the proposed gene annotations must be validated through molecular cloning and biochemical methods to determine whether these proteins are indeed expressed by C. burnetii and carry out their predicted functions.
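The idea of predicting a localization property directly from a primary amino acid sequence can be illustrated with a minimal sketch. The example below is not the method used by TMHMM or Phobius (both rely on hidden Markov models) and is not taken from this project; it swaps in the much simpler Kyte-Doolittle hydropathy scan, which flags stretches whose mean hydropathy over a 19-residue window exceeds 1.6 as candidate transmembrane regions. The function names and the toy sequence are hypothetical and chosen only for illustration.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_windows(seq, window=19):
    """Return the mean hydropathy of every full window along the sequence.

    Raises KeyError if the sequence contains a non-standard residue code.
    """
    scores = [KD[aa] for aa in seq.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def candidate_tm_segments(seq, window=19, threshold=1.6):
    """Return merged (start, end) ranges, 0-based and half-open, that are
    covered by at least one window whose mean hydropathy exceeds threshold."""
    segments = []
    for i, mean in enumerate(hydropathy_windows(seq, window)):
        if mean > threshold:
            start, end = i, i + window
            if segments and start <= segments[-1][1]:
                # Overlaps or touches the previous hit: extend it.
                segments[-1] = (segments[-1][0], end)
            else:
                segments.append((start, end))
    return segments


if __name__ == "__main__":
    # Made-up toy sequence: a hydrophobic core flanked by charged residues.
    toy = "D" * 20 + "L" * 20 + "K" * 20
    print(candidate_tm_segments(toy))
```

A scan like this gives only a rough first impression of where hydrophobic stretches lie; the dedicated tools named above model signal peptides and membrane topology explicitly and should be preferred for actual annotation.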
Keywords:
Coxiella burnetii, gene annotation, gene function, bioinformatics
Department
Biology
College
College of Science and Engineering
Included in
Gene Annotation of the Hypothetical Protein-Coding Genes of Coxiella Burnetii