FragAnchor: a large-scale predictor of glycosylphosphatidylinositol anchors in eukaryote protein sequences by qualitative scoring.
- Author:
Guylaine POISSON (1); Cedric CHAUVE; Xin CHEN; Anne BERGERON
Author Information
1. Department of Information and Computer Sciences, University of Hawaii at Manoa, Honolulu, HI 96822, USA. guylaine@hawaii.edu
- Publication Type:Journal Article
- MeSH:
Amino Acid Sequence;
Computational Biology/methods;
Databases, Protein;
Eukaryotic Cells/chemistry;
Glycosylphosphatidylinositols/chemistry, isolation & purification, metabolism;
Humans;
Hydrophobic and Hydrophilic Interactions;
Markov Chains;
Models, Genetic;
Molecular Sequence Data;
Neural Networks (Computer);
Predictive Value of Tests;
Protein Processing, Post-Translational;
Proteome/analysis;
Sensitivity and Specificity;
Sequence Analysis, Protein
- From:
Genomics, Proteomics & Bioinformatics
2007;5(2):121-130
- Country:China
- Language:English
- Abstract:
A glycosylphosphatidylinositol (GPI) anchor is a common but complex C-terminal post-translational modification of extracellular proteins in eukaryotes. Here we investigate the problem of correctly annotating GPI-anchored proteins among the growing number of sequences in public databases. We developed a computational system, called FragAnchor, based on the tandem use of a neural network (NN) and a hidden Markov model (HMM). First, the NN selects potential GPI-anchored proteins in a dataset; then the HMM parses these potential GPI signals and refines the prediction by qualitative scoring. FragAnchor correctly predicted 91% of all the GPI-anchored proteins annotated in the Swiss-Prot database. In a large-scale analysis of 29 eukaryote proteomes, FragAnchor predicted that the percentage of highly probable GPI-anchored proteins is between 0.21% and 2.01%. The distinctive feature of FragAnchor, compared with other systems, is that it targets only the C-terminus of a protein, making it less sensitive to the background noise found in databases and to possibly incomplete protein sequences. Moreover, FragAnchor can be used to predict GPI-anchored proteins in all eukaryotes. Finally, by using qualitative scoring, the predictions combine both sensitivity and information content. The predictor is publicly available at [see text].
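The tandem design described in the abstract (a first-stage filter on the C-terminus, followed by a second stage that assigns a qualitative label) can be illustrated with a minimal sketch. This is not the FragAnchor implementation: the trained NN and HMM are replaced here by toy hydrophobicity heuristics, and the fragment length (50), the thresholds, the residue set, the function names, and the label "probable" are all assumptions made for illustration. Only the two-stage structure, the C-terminus-only input, and the "highly probable" label come from the abstract.

```python
from typing import Callable, Optional

# Assumption: a rough set of hydrophobic residues, used only by the toy stages.
HYDROPHOBIC = set("AVLIMFWC")


def c_terminus(seq: str, length: int = 50) -> str:
    """Return the last `length` residues (the whole sequence if it is shorter)."""
    return seq[-length:]


def tail_hydrophobic_fraction(fragment: str, tail_len: int = 15) -> float:
    """Fraction of hydrophobic residues in the last `tail_len` residues."""
    tail = fragment[-tail_len:]
    if not tail:
        return 0.0
    return sum(r in HYDROPHOBIC for r in tail) / len(tail)


def toy_stage1(fragment: str) -> bool:
    """Stand-in for the NN filter: keep fragments with a hydrophobic tail."""
    return tail_hydrophobic_fraction(fragment) >= 0.6


def toy_stage2(fragment: str) -> str:
    """Stand-in for HMM qualitative scoring: map a toy score to a label."""
    if tail_hydrophobic_fraction(fragment) >= 0.8:
        return "highly probable"
    return "probable"  # hypothetical label, not taken from the source


def predict(
    seq: str,
    stage1: Callable[[str], bool] = toy_stage1,
    stage2: Callable[[str], str] = toy_stage2,
    length: int = 50,
) -> Optional[str]:
    """Run the two stages in tandem on the C-terminal fragment only.

    Returns None when stage 1 rejects the sequence, otherwise the
    qualitative label assigned by stage 2.
    """
    fragment = c_terminus(seq.upper(), length)
    if not stage1(fragment):
        return None
    return stage2(fragment)
```

Because only the C-terminal fragment is passed to either stage, a sequence truncated at its N-terminus yields the same result as the full-length sequence, which is the robustness property the abstract attributes to this design. Real models would be plugged in through the `stage1` and `stage2` parameters.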