Abstract
Homologous templates can effectively improve the accuracy of protein structure prediction. However, for some multi-domain proteins, few homologous templates are available in the PDB, which may limit prediction accuracy. To further improve the modeling accuracy of multi-domain proteins, this paper proposes MTDA, an end-to-end multi-domain assembly method that combines structural analogue templates with homologous templates. First, a sequence database is searched to generate multiple sequence alignments, and PDB100 and MPDB are searched to generate homologous templates and structural analogue templates, respectively. Next, sequence features, template features, and single-domain features are extracted. Then, a neural network that combines the EfficientNetV2 architecture with an attention mechanism predicts the inter-domain orientation of the multi-domain protein, so that the single-domain structures can be assembled directly into a full-chain structure. Experimental results on 125 test proteins and 65 human proteins show that MTDA outperforms E2EDA, an end-to-end assembly method that uses only homologous templates, and AlphaFold2, a full-chain modeling method.
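The final assembly step can be illustrated with a minimal sketch. It assumes, purely for illustration, that a predicted inter-domain orientation can be expressed as a rigid-body transform (a rotation matrix plus a translation vector) applied to the second domain; the abstract does not state how MTDA represents or applies the orientation, and the function names and toy coordinates below are hypothetical.

```python
import math

def rotation_about_z(angle_rad):
    """Return a 3x3 rotation matrix for a rotation about the z axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def apply_rigid_transform(coords, rotation, translation):
    """Rotate, then translate, every (x, y, z) point in coords."""
    moved = []
    for x, y, z in coords:
        moved.append(tuple(
            rotation[i][0] * x + rotation[i][1] * y + rotation[i][2] * z
            + translation[i]
            for i in range(3)
        ))
    return moved

def assemble_two_domains(domain_a, domain_b, rotation, translation):
    """Keep domain_a fixed, place domain_b with the given transform,
    and concatenate both into one full-chain coordinate list."""
    return list(domain_a) + apply_rigid_transform(domain_b, rotation, translation)

# Toy C-alpha coordinates (hypothetical): two residues per domain.
domain_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
domain_b = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]

# Hypothetical "predicted" orientation: 90 degrees about z, shifted 10 units in x.
full_chain = assemble_two_domains(
    domain_a, domain_b, rotation_about_z(math.pi / 2), (10.0, 0.0, 0.0)
)
```

Because the transform is rigid, distances inside each domain are unchanged; only the relative placement of the two domains differs, which is the quantity an orientation predictor would have to supply.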
Authors
朱海涛
夏瑜豪
张贵军
ZHU Haitao; XIA Yuhao; ZHANG Guijun (College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China)
Source
《小型微型计算机系统》
CSCD
Peking University Core Journals (北大核心)
2024, No. 8, pp. 1825-1831 (7 pages)
Journal of Chinese Computer Systems
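Funding
Supported by the National Key Research and Development Program of China (2019YFE0126100)
Supported by the National Natural Science Foundation of China (62173304).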
Keywords
multi-domain proteins
template modeling
deep learning
domain assembly
inter-domain orientation prediction