Integration of A Deep Learning Classifier with A Random Forest Approach for Predicting Malonylation Sites (Cited by: 5)

Abstract: As a newly identified protein post-translational modification, malonylation is involved in a variety of biological functions. Recognizing malonylation sites in substrates is an initial but crucial step in elucidating the molecular mechanisms underlying protein malonylation. In this study, we constructed a deep learning (DL) network classifier based on long short-term memory (LSTM) with word embedding (LSTMWE) for the prediction of mammalian malonylation sites. LSTMWE performs better than traditional classifiers developed with common pre-defined feature encodings and better than a DL classifier based on LSTM with a one-hot vector. The performance of LSTMWE is sensitive to the size of the training set, but this limitation can be overcome by integration with a traditional machine learning (ML) classifier. Accordingly, an integrated approach called LEMP was developed, which combines LSTMWE with a random forest classifier that uses a novel encoding of enhanced amino acid content. LEMP outperforms both the individual classifiers and the currently available malonylation predictors. Additionally, it achieves promising performance at a low false positive rate, which is highly useful in practical prediction. Overall, LEMP is a useful tool for easily identifying malonylation sites with high confidence. LEMP is available at http://www.bioinfogo.org/lemp.
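
The abstract contrasts two ways of feeding a peptide to an LSTM: a one-hot vector per residue, and integer indices that a trainable word-embedding layer maps to dense vectors. The sketch below illustrates only that difference in input encoding. It is not the authors' implementation: the window length, the padding symbol `X`, and all function names are assumptions made for illustration.

```python
# Illustrative sketch only; names, window size and padding symbol are assumptions.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"   # the 20 standard residues
ALPHABET = AMINO_ACIDS + "X"           # "X" pads windows that run past a terminus
INDEX = {aa: i for i, aa in enumerate(ALPHABET)}


def extract_window(sequence, site, flank=3):
    """Return the residues from site-flank to site+flank (0-based site),
    substituting "X" for positions outside the sequence."""
    residues = []
    for pos in range(site - flank, site + flank + 1):
        residues.append(sequence[pos] if 0 <= pos < len(sequence) else "X")
    return "".join(residues)


def to_indices(window):
    """Integer encoding: one index per residue. An embedding layer would
    look up a learned dense vector for each index."""
    return [INDEX[aa] for aa in window]


def to_one_hot(window):
    """One-hot encoding: one fixed sparse vector of length 21 per residue."""
    size = len(ALPHABET)
    return [[1 if INDEX[aa] == j else 0 for j in range(size)] for aa in window]


# Usage: a made-up sequence with a candidate lysine (K) at position 1.
window = extract_window("MKTAYIAKQR", site=1, flank=3)
print(window)               # XXMKTAY
print(to_indices(window))   # one integer per residue
print(len(to_one_hot(window)), len(to_one_hot(window)[0]))  # rows x columns
```

With the integer encoding, the representation of each residue is learned during training, whereas the one-hot representation is fixed in advance; this is the distinction the abstract draws between LSTMWE and the one-hot LSTM baseline.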

Authors: Zhen Chen, Ningning He, Yu Huang, Wen Tao Qin, Xuhan Liu, Lei Li

Affiliations: School of Basic Medicine; School of Data Science and Software Engineering; Department of Biochemistry; Department of Information Technology; Qingdao Cancer Institute

Source: Genomics, Proteomics & Bioinformatics (基因组蛋白质组与生物信息学报（英文版）), indexed in SCIE, CAS, CSCD; 2018, Issue 6, pp. 451-459 (9 pages)

Funding: Supported in part by funds from the Young Scientists Fund of the National Natural Science Foundation of China (Grant No. 31701142 to ZC; Grant No. 81602621 to NH), the Qingdao Postdoctoral Science Foundation (Grant No. 2016061 to NH), the Shandong Provincial Natural Science Foundation (Grant No. ZR2016CM14 to LL), and the National Natural Science Foundation of China (Grant No. 31770821 to LL); also supported by the "Distinguished Expert of Overseas Tai Shan Scholar" program.

Keywords: Deep learning; Recurrent neural network; LSTM; Malonylation; Random forest

Classification codes: Q811.4 [Biology: Bioengineering]; TP18 [Automation and Computer Technology: Control Theory and Control Engineering]
