Abstract
Objective To evaluate the prediction performance of five bioinformatics tools (SIFT, PolyPhen2, MutationTaster, Provean, and MutationAssessor). Methods From the authors' own database of genetic mutations collected over the past five years, Chinese literature databases, the Human Gene Mutation Database, and dbSNP, 121 missense mutations confirmed by functional studies and 121 missense mutations suggested to be pathogenic by pedigree analysis were collected as the positive gold standard, while 242 missense mutations with a minor allele frequency (MAF) > 5% in genes causing dominant hereditary diseases were used as the negative gold standard. All selected mutations were then predicted with the five tools. Performance was assessed with nine metrics: sensitivity, specificity, positive predictive value, false positive rate, negative predictive value, false negative rate, false discovery rate, accuracy, and the receiver operating characteristic (ROC) curve. Results In terms of sensitivity, negative predictive value, and false negative rate, the ranking was MutationTaster, PolyPhen2, Provean, SIFT, and MutationAssessor. For specificity and false positive rate, the ranking was MutationTaster, Provean, MutationAssessor, SIFT, and PolyPhen2. For positive predictive value and false discovery rate, the ranking was MutationTaster, Provean, MutationAssessor, PolyPhen2, and SIFT. For area under the ROC curve (AUC) and accuracy, the ranking was MutationTaster, Provean, PolyPhen2, MutationAssessor, and SIFT. Conclusion The measured performance of each tool varies with the metric used for evaluation. Among the five tools, MutationTaster performed best on all nine metrics.
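The threshold-based metrics named in the abstract all follow from a 2x2 confusion table of predictions against the gold standard, and the AUC follows from comparing scores of positive and negative variants. The sketch below illustrates those standard definitions in Python. The class `Confusion`, the functions `metrics` and `auc`, and the example counts are assumptions made for illustration only; the counts are not results from the study, and only the group sizes (242 positive and 242 negative variants) come from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """2x2 confusion table for one prediction tool (hypothetical helper)."""
    tp: int  # pathogenic variants predicted damaging
    fn: int  # pathogenic variants predicted benign
    tn: int  # benign variants predicted benign
    fp: int  # benign variants predicted damaging


def metrics(c: Confusion) -> dict:
    """Return the eight threshold-based metrics using their standard definitions."""
    pos = c.tp + c.fn          # size of the positive gold standard
    neg = c.tn + c.fp          # size of the negative gold standard
    called_pos = c.tp + c.fp   # everything the tool called damaging
    called_neg = c.tn + c.fn   # everything the tool called benign
    return {
        "sensitivity": c.tp / pos,
        "specificity": c.tn / neg,
        "ppv": c.tp / called_pos,
        "npv": c.tn / called_neg,
        "fpr": c.fp / neg,
        "fnr": c.fn / pos,
        "fdr": c.fp / called_pos,
        "accuracy": (c.tp + c.tn) / (pos + neg),
    }


def auc(pos_scores, neg_scores) -> float:
    """AUC as the probability that a positive variant outscores a negative one.

    Ties count as half a win (Mann-Whitney formulation). Assumes that higher
    scores mean "more likely damaging"; tools that score in the opposite
    direction would need their scores negated first.
    """
    wins = sum((p > n) + 0.5 * (p == n) for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))


# Made-up counts for one tool: 242 positives and 242 negatives in total.
example = Confusion(tp=200, fn=42, tn=210, fp=32)
result = metrics(example)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Several of these metrics are complements of one another (false negative rate = 1 - sensitivity, false positive rate = 1 - specificity, false discovery rate = 1 - positive predictive value), which is consistent with the abstract reporting a single ranking for each such group.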
Source
Chinese Journal of Medical Genetics (《中华医学遗传学杂志》)
CAS
CSCD
Peking University Core Journals (北大核心)
2016, Issue 5, pp. 625-628 (4 pages)
Keywords
Missense mutation
Bioinformatics software
Performance evaluation