1. Evaluation of performance of five bioinformatics software for the prediction of missense mutations.
Qianting CHEN ; Congling DAI ; Qianjun ZHANG ; Juan DU ; Wen LI
Chinese Journal of Medical Genetics 2016;33(5):625-628
OBJECTIVE: To evaluate the prediction performance of five bioinformatics software tools (SIFT, PolyPhen2, MutationTaster, Provean, and MutationAssessor) for missense mutations.
METHODS: From our own database of genetic mutations collected over the past five years, Chinese literature databases, the Human Gene Mutation Database, and dbSNP, 121 missense mutations confirmed by functional studies and 121 missense mutations suspected to be pathogenic on the basis of pedigree analysis were used as the positive gold standard, while 242 missense mutations with a minor allele frequency (MAF) >5% in dominant hereditary diseases were used as the negative gold standard. The selected mutations were analyzed with each of the five software tools. Based on the results, the tools were evaluated for sensitivity, specificity, positive predictive value, false positive rate, negative predictive value, false negative rate, false discovery rate, accuracy, and the receiver operating characteristic (ROC) curve.
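The threshold-based metrics listed in the methods can all be derived from the four cells of a confusion matrix. The following is a minimal sketch, not the authors' code: the function name `confusion_metrics` and the example counts are hypothetical, chosen only so that the positive and negative sets each total 242, as in the gold standard described above. They are not results from the study.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Compute the evaluation metrics named in the methods from confusion-matrix counts.

    tp: pathogenic variants predicted damaging   fn: pathogenic predicted benign
    tn: benign variants predicted benign         fp: benign predicted damaging
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "positive_predictive_value": ppv,
        "negative_predictive_value": npv,
        "false_positive_rate": 1 - specificity,   # complement of specificity
        "false_negative_rate": 1 - sensitivity,   # complement of sensitivity
        "false_discovery_rate": 1 - ppv,          # complement of PPV
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts for one tool (NOT data from the study):
# 242 positives = 200 TP + 42 FN, 242 negatives = 220 TN + 22 FP.
example = confusion_metrics(tp=200, fp=22, tn=220, fn=42)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Because the false negative rate, false positive rate, and false discovery rate are the complements of sensitivity, specificity, and positive predictive value respectively, each such pair necessarily ranks a set of tools in the same order.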
RESULTS: For sensitivity, negative predictive value, and false negative rate, the ranking was MutationTaster, PolyPhen2, Provean, SIFT, and MutationAssessor. For specificity and false positive rate, the ranking was MutationTaster, Provean, MutationAssessor, SIFT, and PolyPhen2. For positive predictive value and false discovery rate, the ranking was MutationTaster, Provean, MutationAssessor, PolyPhen2, and SIFT. For area under the ROC curve (AUC) and accuracy, the ranking was MutationTaster, Provean, PolyPhen2, MutationAssessor, and SIFT.
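Unlike the threshold-based metrics, the AUC summarizes performance across all score cutoffs. One common way to compute it, sketched below, is the pairwise (Mann-Whitney) formulation: the AUC equals the probability that a randomly chosen pathogenic variant receives a higher score than a randomly chosen benign one, with ties counted as one half. This is an illustration only, not the procedure used in the study. The function name `auc_from_scores` and the scores are hypothetical, and the sketch assumes scores are oriented so that higher means more likely pathogenic (tools such as SIFT, where lower scores indicate a damaging variant, would need their scores flipped first).

```python
def auc_from_scores(pathogenic_scores, benign_scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney formulation).

    Assumes a higher score means "more likely pathogenic".
    """
    wins = 0.0
    for p in pathogenic_scores:
        for b in benign_scores:
            if p > b:
                wins += 1.0      # pathogenic variant correctly ranked above benign
            elif p == b:
                wins += 0.5      # a tie counts as half a correct ranking
    return wins / (len(pathogenic_scores) * len(benign_scores))


# Hypothetical scores (NOT data from the study).
pathogenic = [0.9, 0.8, 0.4]
benign = [0.7, 0.3, 0.4]
auc = auc_from_scores(pathogenic, benign)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two gold-standard sets.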
CONCLUSION: The relative prediction performance of the software may differ depending on the evaluation parameter used. Among the five tools, MutationTaster showed the best prediction performance.
Computational Biology ; methods ; DNA Mutational Analysis ; methods ; Gene Frequency ; Humans ; Mutation, Missense ; genetics ; Polymorphism, Single Nucleotide ; genetics ; Reproducibility of Results ; Software