Integration of A Deep Learning Classifier with A Random Forest Approach for Predicting Malonylation Sites (Cited by: 5)

Abstract: As a newly identified protein post-translational modification, malonylation is involved in a variety of biological functions. Recognizing malonylation sites in substrates is an initial but crucial step in elucidating the molecular mechanisms underlying protein malonylation. In this study, we constructed a deep learning (DL) network classifier based on long short-term memory (LSTM) with word embedding (LSTMWE) for the prediction of mammalian malonylation sites. LSTMWE performs better than traditional classifiers developed with common pre-defined feature encodings and better than a DL classifier based on LSTM with a one-hot vector. The performance of LSTMWE is sensitive to the size of the training set, but this limitation can be overcome by integration with a traditional machine learning (ML) classifier. Accordingly, an integrated approach called LEMP was developed, which combines LSTMWE with a random forest classifier that uses a novel encoding of enhanced amino acid content. LEMP not only outperforms the individual classifiers but also outperforms the currently available malonylation predictors. Additionally, it shows promising performance with a low false positive rate, which is highly useful in prediction applications. Overall, LEMP is a useful tool for easily identifying malonylation sites with high confidence. LEMP is available at http://www.bioinfogo.org/lemp.
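
The abstract contrasts two input encodings for the LSTM: a one-hot vector and a word embedding. The sketch below illustrates only that difference in input representation and is not the authors' implementation. The alphabet order, the padding symbol `-`, the helper names, and the 9-residue example window centered on a lysine are all assumptions made for illustration. With one-hot input, each residue is a fixed sparse vector; with word embedding, each residue is an integer token that a trainable embedding layer later maps to a dense vector.

```python
# Illustrative sketch only; names, alphabet order, and window are hypothetical.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"   # 20 standard residues
PAD = "-"                              # assumed padding symbol for window edges
ALPHABET = AMINO_ACIDS + PAD
TOKEN_INDEX = {aa: i for i, aa in enumerate(ALPHABET)}


def to_indices(peptide):
    """Integer tokens, the form an embedding layer would consume."""
    return [TOKEN_INDEX[aa] for aa in peptide]


def to_one_hot(peptide):
    """One fixed sparse vector per residue (no learned representation)."""
    vectors = []
    for idx in to_indices(peptide):
        row = [0] * len(ALPHABET)
        row[idx] = 1
        vectors.append(row)
    return vectors


# Hypothetical 9-residue window with the candidate lysine (K) at the center.
window = "-AGSKLTVE"
indices = to_indices(window)
one_hot = to_one_hot(window)
```

In this sketch the integer sequence has one value per residue, while the one-hot form has 21 values per residue; a model using word embedding would learn its own dense vector for each of the 21 tokens during training.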
Source: Genomics, Proteomics & Bioinformatics (English edition; indexed in SCIE, CAS, CSCD), 2018, Issue 6, pp. 451-459 (9 pages)
Funding: Supported in part by the Young Scientists Fund of the National Natural Science Foundation of China (Grant No. 31701142 to ZC; Grant No. 81602621 to NH), the Qingdao Postdoctoral Science Foundation (Grant No. 2016061 to NH), the Shandong Provincial Natural Science Foundation (Grant No. ZR2016CM14 to LL), and the National Natural Science Foundation of China (Grant No. 31770821 to LL); also supported by the "Distinguished Expert of Overseas Tai Shan Scholar" program.
Keywords: Deep learning; Recurrent neural network; LSTM; Malonylation; Random forest