Protein Residue Contact Prediction Based on Deep Learning and Massive Statistical Features from Multi-Sequence Alignment


Abstract Sequence-based protein tertiary structure prediction is of fundamental importance because the function of a protein ultimately depends on its 3D structure. An accurate residue-residue contact map is one of the essential elements of current ab initio protocols for 3D structure prediction. Recently, the combination of deep learning and direct coupling techniques has brought significant progress in residue contact prediction. However, a considerable number of current Deep-Learning (DL)-based prediction methods are time-consuming, mainly because they rely on several categories of input data and on third-party programs. In this research, we transformed the complex biological problem into a purely computational one through statistics and artificial intelligence. Accordingly, we proposed a feature extraction method that obtains various categories of statistical information from the multi-sequence alignment alone, and then trained a DL model for residue-residue contact prediction on this massive statistical information. The proposed method is robust across different test sets, shows high reliability in its model confidence score, achieves high computational efficiency, and attains prediction precision comparable to that of DL methods that rely on multi-source inputs.
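As a rough illustration of what "statistical information from the multi-sequence alignment" can look like, the sketch below computes per-column symbol frequencies and the mutual information between pairs of alignment columns. The abstract does not list the specific features the authors extract, so mutual information is used here only as a familiar co-variation statistic; the function names and the toy alignment are hypothetical and not taken from the paper.

```python
import math
from collections import Counter
from itertools import combinations


def column_frequencies(msa, col):
    """Relative frequency of each symbol in one alignment column."""
    counts = Counter(seq[col] for seq in msa)
    total = len(msa)
    return {sym: n / total for sym, n in counts.items()}


def mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    fi = column_frequencies(msa, i)
    fj = column_frequencies(msa, j)
    pair_counts = Counter((seq[i], seq[j]) for seq in msa)
    total = len(msa)
    mi = 0.0
    for (a, b), n in pair_counts.items():
        p_ab = n / total
        mi += p_ab * math.log2(p_ab / (fi[a] * fj[b]))
    return mi


def pairwise_mi(msa):
    """Mutual information for every pair of columns (i < j)."""
    length = len(msa[0])
    return {
        (i, j): mutual_information(msa, i, j)
        for i, j in combinations(range(length), 2)
    }


# Hypothetical toy alignment: columns 0 and 3 vary together,
# column 1 is fully conserved, column 2 varies independently.
toy_msa = ["ACDA", "ACEA", "GCDG", "GCEG"]
scores = pairwise_mi(toy_msa)
```

In this toy alignment, the pair of columns that vary together receives the highest score, while pairs involving the conserved or independent column score zero. A real pipeline would stack many such per-pair statistics into an L x L feature map and pass it to a neural network.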

Authors Huiling Zhang, Min Hao, Hao Wu, Hing-Fung Ting, Yihong Tang, Wenhui Xi, Yanjie Wei

Affiliations Shenzhen Institutes of Advanced Technology; University of Chinese Academy of Sciences; College of Electronic and Information Engineering; School of Software Engineering; Department of Computer Science; School of Computer Science

Source Tsinghua Science and Technology (SCIE, EI, CAS, CSCD), 2022, Issue 5, pp. 843-854 (12 pages); Journal of Tsinghua University (Natural Science Edition, English Edition)

Funding Supported by the Strategic Priority CAS Project (No. XDB38050100), the National Key Research and Development Program of China (No. 2018YFB0204403), the National Natural Science Foundation of China (No. U1813203), the Shenzhen Basic Research Fund (Nos. RCYX2020071411473419, JCYJ20200109114818703, and JSGG20201102163800001), the CAS Key Lab (No. 2011DP173015), the Hong Kong Research Grant Council (No. GRF-17208019), and the Outstanding Youth Innovation Fund (Doctoral Students) of CAS-SIAT (No. Y9G054).

Keywords multi-sequence alignment; residue-residue contact prediction; feature extraction; statistical information; Deep Learning (DL); high computational efficiency

Classification codes Q518.2 [Biology: Biochemistry]; TP18 [Automation and Computer Technology: Control Theory and Control Engineering]

