基于深度卷积神经网络的无序蛋白质功能模体的识别 (cited by: 1)

Identifying Molecular Recognition Feature in Disordered Proteins with Deep Convolutional Neural Network
Abstract: Identifying molecular recognition features (MoRFs) in intrinsically disordered proteins by experiment is time-consuming, laborious, and difficult, while traditional computational identification methods rely heavily on hand-picked features and have low accuracy. To address these problems, a method based on a deep convolutional neural network is proposed for predicting MoRF positions in protein sequences. The method takes the protein sequence directly as input and maps it to a numerical matrix by computing the position-specific scoring matrix (PSSM) of the sequence and three groups of amino acid index features. The model then extracts features on its own and automatically identifies the implicit sequence patterns of MoRFs to make predictions. The results show that, when trained and tested on the same datasets, the proposed method clearly outperforms other traditional identification algorithms, achieving an area under the receiver operating characteristic curve (AUC) of 0.708 on the validation set and 0.760 on the test set. This suggests that a deep convolutional neural network can effectively identify the implicit sequence patterns of MoRFs. The method can also be applied to identifying other clustered functional sites in proteins.
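The encoding step described in the abstract (a PSSM plus three amino acid index values per residue, followed by convolution along the sequence) can be sketched roughly as below. This is a minimal illustration, not the authors' implementation: the abstract does not give the network architecture, the kernel size, or which amino acid indices were used, so the function names, the random placeholder PSSM, the random placeholder index tables, and the untrained random weights are all assumptions made only to show the data flow.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def encode_sequence(seq, pssm, index_tables):
    """Stack a PSSM with per-residue amino acid index values.

    seq          : protein sequence of length L (one-letter codes)
    pssm         : array of shape (L, 20), assumed to be precomputed elsewhere
    index_tables : list of dicts mapping residue -> float (hypothetical tables)
    Returns an array of shape (L, 20 + len(index_tables)).
    """
    pssm = np.asarray(pssm, dtype=float)
    if pssm.shape != (len(seq), 20):
        raise ValueError("pssm must have shape (len(seq), 20)")
    index_cols = np.array(
        [[table[res] for table in index_tables] for res in seq], dtype=float
    )
    return np.hstack([pssm, index_cols])


def conv1d_same(features, kernels, bias):
    """1D convolution along the sequence with zero padding, then ReLU.

    features : (L, C) feature matrix
    kernels  : (K, C, F) filter bank with odd kernel width K
    bias     : (F,)
    Returns an (L, F) array, one row of activations per residue.
    """
    k = kernels.shape[0]
    pad = k // 2
    padded = np.pad(features, ((pad, pad), (0, 0)))
    length = features.shape[0]
    out = np.empty((length, kernels.shape[2]))
    for i in range(length):
        window = padded[i:i + k]  # (K, C) slice centred on residue i
        out[i] = np.tensordot(window, kernels, axes=([0, 1], [0, 1])) + bias
    return np.maximum(out, 0.0)


def residue_scores(hidden, weights, bias):
    """Map per-residue activations to a score in (0, 1) with a sigmoid."""
    logits = hidden @ weights + bias
    return 1.0 / (1.0 + np.exp(-logits))


rng = np.random.default_rng(0)
seq = "MKTAYIAKQR"  # toy sequence, not taken from the paper's datasets

# Placeholder inputs: a real PSSM would come from a profile search tool,
# and real index tables from a published amino acid index collection.
pssm = rng.integers(-5, 6, size=(len(seq), 20))
tables = [dict(zip(AMINO_ACIDS, rng.normal(size=20))) for _ in range(3)]

features = encode_sequence(seq, pssm, tables)           # (10, 23)
kernels = rng.normal(scale=0.1, size=(5, features.shape[1], 8))
hidden = conv1d_same(features, kernels, np.zeros(8))    # (10, 8)
scores = residue_scores(hidden, rng.normal(size=8), 0.0)  # (10,)
```

With trained weights in place of the random ones, `scores` would hold one MoRF propensity per residue, and ranking residues by these scores against known labels is what an AUC evaluation such as the one reported above would measure.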
Authors: FANG Chun (方春); TIAN Aikui (田爱奎); SUN Fuzhen (孙福振); LI Caihong (李彩虹); ZHU Daming (朱大铭). School of Computer Science and Technology, Shandong University of Technology, Zibo 255049, China; Shandong Provincial Key Laboratory of Software Engineering, Shandong University, Jinan 250000, China
Source: Journal of University of Jinan (Science and Technology) (《济南大学学报(自然科学版)》), 2018, No. 4, pp. 280-285 (6 pages). Indexed in CAS and the Peking University Core Journals list.
Funding: National Natural Science Foundation of China (61602280, 61473179); Natural Science Foundation of Shandong Province (ZR2014FQ028)
Keywords: deep convolutional neural network; disordered protein; sequence pattern; identification
