Construction of a protein decoy sequence database based on the Attention Bi-LSTM model (cited by: 2)
Abstract Identifying protein sequences from massive mass spectrometry data by computational means is one of the most basic and important tasks in proteomics research, and the quality of the decoy sequence database is one of the keys to successful quality control of protein identification. This study develops a decoy sequence construction method based on an attention-mechanism bidirectional long short-term memory neural network (Attention Bi-LSTM). The work is built on an encoder-decoder framework. The bidirectional long short-term memory network addresses the vanishing gradient problem of traditional recurrent neural networks (RNNs) and captures dependency information in both the forward and backward directions, which is an advantage when processing sequence data. An attention mechanism is introduced to strengthen the model's focus on the correlation between the target and decoy sequence databases. The method is compared with the commonly used randomized and reversed algorithms. The results show that the decoy sequence database built with the Attention Bi-LSTM model meets the feature requirements of an ideal decoy sequence database, with composition characteristics similar to those of the target database. Comparative analyses on experimental datasets of different sizes, at the spectrum, peptide, and protein levels, show that it offers better sensitivity than the other methods. Attention Bi-LSTM is therefore a promising method for constructing decoy sequence databases.
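The reversed and randomized baselines that the abstract compares against can be illustrated with a minimal Python sketch. This is not the authors' code: the function names and the example sequence are hypothetical, and the shuffle shown is only one common way to randomize a sequence. The sketch also checks one property the abstract asks of a decoy database, namely that its amino acid composition resembles that of the target.

```python
import random
from collections import Counter


def reversed_decoy(sequence: str) -> str:
    """Build a decoy by reading the target sequence backwards."""
    return sequence[::-1]


def shuffled_decoy(sequence: str, seed: int = 0) -> str:
    """Build a decoy by randomly permuting the target's residues.

    A fixed seed (an assumption made here for reproducibility) gives
    the same decoy on every run.
    """
    rng = random.Random(seed)
    residues = list(sequence)
    rng.shuffle(residues)
    return "".join(residues)


def composition(sequence: str) -> Counter:
    """Count how often each amino acid letter occurs."""
    return Counter(sequence)


# Hypothetical target sequence, used only for illustration.
target = "MKWVTFISLLLLFSSAYSRGVFRR"

rev = reversed_decoy(target)
shuf = shuffled_decoy(target)

# Both baselines keep the length and the residue composition of the target.
same_composition = composition(rev) == composition(target) == composition(shuf)
print(rev)
print(shuf)
print("composition preserved:", same_composition)
```

Both baselines preserve length and composition by construction, which is why they are easy to build. The abstract reports that decoys generated by the Attention Bi-LSTM model give better identification sensitivity than these two approaches.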
Authors ZENG Xiangli (曾祥利); MA Jie (马洁); ZHU Yunping (朱云平); SHU Kunxian (舒坤贤). Affiliations: Chongqing Key Laboratory on Big Data for Bio Intelligence, College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, P.R. China; National Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing 102206, P.R. China
Source Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition), 2020, No. 4, pp. 655-663 (9 pages). Indexed in CSCD and the Peking University Core Journals list (北大核心).
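Funding National Natural Science Foundation of China (61501071, 21475150); National High Technology Research and Development Program of China (2015AA020108, 2015AA020101).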
Keywords protein identification; decoy sequence database; long short-term memory neural network; attention mechanism