Description
Technological advances over the past two decades have made it possible to sequence and analyze any genome. However, although many organisms have had their genomes sequenced, few of those genomes have been manually curated to propose functions for their hypothetical protein-coding genes. The amino acid sequences encoded by an organism’s genes can be used to propose functions for its proteins. Online programs such as BLAST, T-COFFEE, MUSCLE, TMHMM, SignalP, Phobius, and PSORTb identify properties of these sequences, including conserved domains, protein families, protein conformations, signal peptides, and transmembrane regions. These features can be examined to provide insight into the likely organismal lineage, function, and cellular location of a protein, or its intended use by the organism. The purpose of this research was to propose functions for five unannotated hypothetical protein-coding genes in the genome of Yersinia pestis, the bacterium best known as the cause of bubonic plague. The genes A1122_RS20830, A1122_RS20870, A1122_RS21135, A1122_RS21140, and A1122_RS21150 were analyzed and predicted to be, respectively, a gene encoding DNA polymerase III subunit alpha, a gene encoding an intermembrane protease, a gene encoding a domain protein of the enzyme diguanylate cyclase, a pseudogene, and a gene encoding a major facilitator superfamily (MFS) transporter. These predictions offer some insight into the functions of the Yersinia pestis proteome. However, these hypothetical annotations must be validated experimentally, for example through biochemical manipulations, to determine whether Yersinia pestis expresses these proteins and whether they perform their bioinformatically proposed functions.
Keywords:
Bioinformatics, gene annotation, Yersinia pestis
Department
Biology
College
College of Science and Engineering
Gene Annotation of the Hypothetical Protein-Coding Genes of Yersinia pestis