Prediction of protein sumoylation sites based on LightGBM
Abstract: Accurate identification of protein sumoylation sites is of great significance for both basic research and drug development. This paper proposes a sumoylation site prediction model based on protein sequence features. The model combines statistical features of the physicochemical properties of amino acids with bi-gram pattern features of the amino acid sequence, and trains a light gradient boosting machine (LightGBM) classifier to predict the sumoylation sites of a given protein sequence. The paper compares the discriminative power of the different features and the prediction performance of different classification models. Experimental results on benchmark datasets demonstrate the effectiveness of the proposed method: it achieves a marked improvement over existing methods, with a Matthews correlation coefficient of 91.64%.
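The abstract names two pieces that are easy to illustrate: bi-gram pattern features and the Matthews correlation coefficient (MCC). The following is a minimal sketch and not the authors' implementation. It assumes bi-grams are counted directly on residues in a window centered on each lysine (K), the residue on which sumoylation occurs. The window half-width `flank`, the padding symbol `X`, and the helper names `lysine_windows`, `bigram_features`, and `matthews_corrcoef` are all hypothetical; the abstract does not specify the paper's actual bi-gram definition or window size.

```python
from itertools import product
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered pairs of the 20 standard amino acids.
BIGRAMS = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def lysine_windows(sequence, flank=10, pad="X"):
    """Yield (position, window) for every lysine; flank and pad are assumed values."""
    padded = pad * flank + sequence + pad * flank
    for i, residue in enumerate(sequence):
        if residue == "K":
            # Position i in the sequence sits at i + flank in the padded string,
            # so this slice is centered on the lysine.
            yield i, padded[i:i + 2 * flank + 1]


def bigram_features(window):
    """Return a 400-dimensional vector of normalized adjacent-pair frequencies."""
    counts = dict.fromkeys(BIGRAMS, 0)
    for a, b in zip(window, window[1:]):
        if a + b in counts:  # pairs with padding or non-standard residues are skipped
            counts[a + b] += 1
    total = sum(counts.values())
    return [counts[bg] / total if total else 0.0 for bg in BIGRAMS]


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC from confusion-matrix counts; returns 0.0 when the denominator is zero."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

Under these assumptions, each candidate lysine yields one feature vector. Such vectors could be concatenated with physicochemical statistics and passed to a gradient boosting classifier such as `lightgbm.LGBMClassifier`. MCC ranges from -1 to 1, so the 91.64% reported above corresponds to 0.9164 on that scale.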
Authors: Chen Huanchao; Wei Zhisen; Yu Dongjun; Yang Jingmin; Yang Jingyu (School of Computer Science; Fujian Provincial Universities Key Laboratory of Data Science and Intelligence Application, Minnan Normal University, Zhangzhou 363000, China; School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China)
Source: Journal of Nanjing University of Science and Technology, 2022, No. 2, pp. 156-163 (8 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list.
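Funding: Natural Science Foundation of Fujian Province (2020J01813); Young and Middle-aged Program of the Education Department of Fujian Province (JAT190362).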
Keywords: post-translational modifications; protein sumoylation sites; sequence-based prediction; light gradient boosting machine; bi-gram pattern