Protein Function Prediction Model Based on Contrastive Learning

Abstract: Determining protein function supports the resolution of many biological problems, and a variety of machine learning methods have been proposed to predict it. Most of these methods rely on features such as protein sequence, structural domains, and protein-protein interactions. For newly discovered proteins, however, features other than the sequence are difficult to obtain, so methods that predict function from sequence alone are particularly valuable. To this end, this paper proposes a sequence-based model for protein function prediction. The model further mines the information in protein sequences within a contrastive learning framework and also makes effective use of the co-occurrence relationships among protein function labels. Experimental results show that the proposed model achieves good predictive performance for protein function.
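The abstract does not say how the co-occurrence relationship among function labels is used. As a rough illustration only, and not the authors' method, the sketch below counts how often two labels annotate the same protein and then blends each label's raw score with scores propagated from its co-occurring labels. The function names, the `alpha` mixing weight, and the toy data are all assumptions made for this example.

```python
def cooccurrence_matrix(annotations, n_labels):
    """Count label pairs that annotate the same protein.

    annotations: list of sets of label indices, one set per protein
    (hypothetical input format chosen for this sketch).
    """
    C = [[0] * n_labels for _ in range(n_labels)]
    for labels in annotations:
        for i in labels:
            for j in labels:
                if i != j:  # ignore a label co-occurring with itself
                    C[i][j] += 1
    return C


def propagate_scores(scores, C, alpha=0.5):
    """Blend raw label scores with scores spread along co-occurrence edges.

    alpha is an assumed mixing weight: 0 keeps the raw scores,
    1 uses only the propagated scores.
    """
    propagated = [0.0] * len(scores)
    for i, row in enumerate(C):
        total = sum(row)
        if total == 0:  # label i never co-occurs with another label
            continue
        for j, count in enumerate(row):
            # Label i passes its score to label j in proportion to
            # how often the two labels appear together.
            propagated[j] += scores[i] * count / total
    return [(1 - alpha) * s + alpha * p for s, p in zip(scores, propagated)]


# Toy data: labels 0 and 1 always appear together, label 2 appears alone.
toy_annotations = [{0, 1}, {0, 1}, {2}]
toy_C = cooccurrence_matrix(toy_annotations, n_labels=3)
toy_scores = propagate_scores([1.0, 0.0, 0.2], toy_C, alpha=0.5)
```

In this toy case label 1 starts with a score of 0 but receives half of label 0's score because the two labels always co-occur, while label 2 receives nothing from its neighbours.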

Authors: SUN Xu; LIN Jie (School of Mathematics and Statistics, Fujian Normal University, Fuzhou 350117, China; College of Computer and Cyber Security, Fujian Normal University, Fuzhou 350117, China)

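Affiliations: School of Mathematics and Statistics, Fujian Normal University; College of Computer and Cyber Security, Fujian Normal University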

Source: Journal of Fujian Normal University: Natural Science Edition (CAS), 2023, Issue 6, pp. 32-39 (8 pages)

Funding: National Natural Science Foundation of China (61472082).

Keywords: contrastive learning; gene ontology; label propagation

Classification code: Q811.4 [Biology: Bioengineering]

