1. Predicting the Subcellular Localization of Human Proteins Using Machine Learning and Exploratory Data Analysis
Acquaah-Mensah K. GEORGE ; Leach M. SONIA ; Guda CHITTIBABU
Genomics, Proteomics & Bioinformatics 2006;4(2):120-133
Identifying the subcellular localization of proteins is particularly helpful in the functional annotation of gene products. In this study, we use Machine Learning and Exploratory Data Analysis (EDA) techniques to examine and characterize amino acid sequences of human proteins localized in nine cellular compartments. A dataset of 3,749 protein sequences representing human proteins was extracted from the SWISS-PROT database. Feature vectors were created to capture specific amino acid sequence characteristics. Relative to a Support Vector Machine, a Multi-layer Perceptron, and a Naive Bayes classifier, the C4.5 Decision Tree algorithm was the most consistent performer across all nine compartments in reliably predicting the subcellular localization of proteins based on their amino acid sequences (average Precision=0.88; average Sensitivity=0.86). Furthermore, EDA graphics characterized essential features of proteins in each compartment. As examples, proteins localized to the plasma membrane had higher proportions of hydrophobic amino acids; cytoplasmic proteins had higher proportions of neutral amino acids; and mitochondrial proteins had higher proportions of neutral amino acids and lower proportions of polar amino acids. These data showed that the C4.5 classifier and EDA tools can be effective for characterizing and predicting the subcellular localization of human proteins based on their amino acid sequences.
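The abstract mentions feature vectors built from amino acid sequence characteristics, such as the proportions of hydrophobic, neutral, and polar residues. A minimal sketch of one such feature is given here for illustration. The grouping of residues in `GROUPS` and the function name `group_proportions` are assumptions made for this example; the abstract does not state which grouping or which features the authors used.

```python
from collections import Counter

# Illustrative grouping only: an assumption for this sketch,
# not the grouping used in the study.
GROUPS = {
    "hydrophobic": set("AVLIMFWC"),
    "neutral": set("GPSTHY"),
    "polar": set("DEKRNQ"),
}


def group_proportions(sequence):
    """Return the fraction of residues falling in each group.

    Characters outside the 20 standard amino acids (e.g. 'X') are ignored.
    """
    counts = Counter(sequence.upper())
    # Count only residues that belong to one of the defined groups.
    total = sum(counts[aa] for members in GROUPS.values() for aa in members)
    if total == 0:
        return {name: 0.0 for name in GROUPS}
    return {
        name: sum(counts[aa] for aa in members) / total
        for name, members in GROUPS.items()
    }


# Hypothetical short sequence, used only to show the call.
features = group_proportions("MKTLLVAGS")
```

The resulting dictionary could serve as part of a feature vector passed to a classifier such as a decision tree, though the actual features in the study may differ.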
2. Reconstruction of pathways associated with amino acid metabolism in human mitochondria.
Purnima GUDA ; Chittibabu GUDA ; Shankar SUBRAMANIAM
Genomics, Proteomics & Bioinformatics 2007;5(3-4):166-176
We have used a bioinformatics approach for the identification and reconstruction of metabolic pathways associated with amino acid metabolism in human mitochondria. Human mitochondrial proteins determined by experimental and computational methods have been superposed on the reference pathways from the KEGG database to identify mitochondrial pathways. Enzymes at the entry and exit points for each reconstructed pathway were identified, and mitochondrial solute carrier proteins were determined where applicable. Intermediate enzymes in the mitochondrial pathways were identified based on the annotations available from public databases, evidence in current literature, or our MITOPRED program, which predicts the mitochondrial localization of proteins. Through integration of the data derived from experimental, bibliographical, and computational sources, we reconstructed the amino acid metabolic pathways in human mitochondria, which could help better understand the mitochondrial metabolism and its role in human health.
Amino Acid Metabolism, Inborn Errors ; genetics ; metabolism ; Amino Acids ; metabolism ; Computational Biology ; Databases, Protein ; Humans ; Mitochondria ; metabolism ; Mitochondrial Proteins ; genetics ; metabolism ; Models, Biological ; Proteomics
3. Reconstruction of Pathways Associated with Amino Acid Metabolism in Human Mitochondria
Guda PURNIMA ; Guda CHITTIBABU ; Subramaniam SHANKAR
Genomics, Proteomics & Bioinformatics 2007;2(3):166-176
We have used a bioinformatics approach for the identification and reconstruction of metabolic pathways associated with amino acid metabolism in human mitochondria. Human mitochondrial proteins determined by experimental and computational methods have been superposed on the reference pathways from the KEGG database to identify mitochondrial pathways. Enzymes at the entry and exit points for each reconstructed pathway were identified, and mitochondrial solute carrier proteins were determined where applicable. Intermediate enzymes in the mitochondrial pathways were identified based on the annotations available from public databases, evidence in current literature, or our MITOPRED program, which predicts the mitochondrial localization of proteins. Through integration of the data derived from experimental, bibliographical, and computational sources, we reconstructed the amino acid metabolic pathways in human mitochondria, which could help better understand the mitochondrial metabolism and its role in human health.