Abstract
A large share of mass spectrometry data either cannot be identified or is identified with insufficient accuracy. In particular, the accuracy of conventional algorithms drops quickly when the peptide database is large. This paper proposes a new blind search algorithm built on a novel scoring model based on similarity relationship measurement. To handle large-scale problems, the algorithm also applies two pre-filtering strategies, parent ion mass filtering and Peptide Sequence Tag (PST) filtering, so that accuracy is well preserved on larger databases. Experimental results show that, at a small cost in running time, the identification accuracy on peptide databases of 10 000, 20 000, and 50 000 entries is 78.3%, 74.2%, and 65.5%, respectively. As the database grows, the algorithm's identification accuracy holds up comparatively well.
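The two pre-filtering steps named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the `max_delta` window (the largest unexplained mass shift tolerated in a blind search), and the plain substring test for tags are all assumptions made for illustration. Only the monoisotopic residue masses are standard values.

```python
# Illustrative sketch only; names and thresholds are assumptions, not the paper's method.

# Standard monoisotopic residue masses (Da).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # mass of H2O added to the residue sum


def peptide_mass(seq):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in seq) + WATER


def prefilter(peptides, precursor_mass, tags, max_delta=200.0):
    """Keep peptides that pass both filters.

    1. Parent ion mass filter: the gap between the observed precursor mass
       and the unmodified peptide mass must be within max_delta (assumed
       window for an unknown modification).
    2. PST filter: the peptide must contain at least one sequence tag.

    Returns (peptide, mass_shift) pairs for the surviving candidates.
    """
    kept = []
    for pep in peptides:
        delta = precursor_mass - peptide_mass(pep)
        if abs(delta) > max_delta:
            continue  # rejected by the parent ion mass filter
        if any(tag in pep for tag in tags):
            kept.append((pep, delta))  # passed the PST filter as well
    return kept


# Hypothetical example: a spectrum of PEPTIDEK carrying an unknown +79.96633 Da shift.
database = ["PEPTIDEK", "GASPVTCK", "WWWTIDEWWWK"]
observed = peptide_mass("PEPTIDEK") + 79.96633
candidates = prefilter(database, observed, tags=["TIDE"])
```

Only the surviving candidates would then be passed to the scoring model, which is how pre-filtering keeps the search tractable as the database grows.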
Source
《计算机应用》
CSCD
Peking University Core Journal
2012, Issue 5, pp. 1488-1490 (3 pages)
Journal of Computer Applications
Funding
Key Program of the National Natural Science Foundation of China (60533020)
Keywords
Peptide Sequence Tag (PST)
Post-Translational Modification (PTM)
tandem mass spectrum (MS/MS)
blind search
large scale