Abstract
Objective: N4-acetylcytidine (ac4C), the only acetylation modification in eukaryotic mRNA, affects mRNA decoding efficiency, helps codons to be read correctly during translation, and improves translation efficiency and mRNA stability. ac4C has been shown to be associated with a variety of human diseases, especially cancer, but accurately identifying ac4C modification sites remains difficult. A deep-learning-based model for identifying ac4C sites is therefore proposed. Methods: A long short-term memory (LSTM) network and a convolutional neural network (CNN) were combined into a deep learning model that extracts semantic features from the sequence, and a random forest was used as the final classifier. Results: The proposed method reached an AUC of 0.8796 in 5-fold cross-validation and 0.8718 in the independent test. Its sensitivity reached 0.6491 in 5-fold cross-validation and 0.6567 in the independent test, exceeding the sensitivity of the state-of-the-art methods. Conclusion: Compared with the traditional features that were tested, the semantic features performed better and help to identify ac4C modification sites more accurately.
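The evaluation quantities named in the abstract (sensitivity, AUC, and a 5-fold split) can be illustrated with a short standard-library Python sketch. This is not the paper's code: the function names `sensitivity`, `auc`, and `k_fold_indices` are hypothetical, the folds are a plain contiguous partition (data would normally be shuffled first), and the AUC is computed by direct pairwise comparison, which is only practical for small samples.

```python
from typing import Iterator, List, Sequence, Tuple


def sensitivity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of true positive sites (label 1) that are predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0:
        raise ValueError("sensitivity is undefined without positive samples")
    return tp / (tp + fn)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def k_fold_indices(n_samples: int, k: int = 5) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    if not 1 < k <= n_samples:
        raise ValueError("k must satisfy 1 < k <= n_samples")
    start = 0
    for fold in range(k):
        # The first (n_samples % k) folds receive one extra sample.
        size = n_samples // k + (1 if fold < n_samples % k else 0)
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size
```

In a cross-validation run, each `(train, test)` pair would be used to fit the classifier on the training indices and to score the held-out indices, and the per-fold `auc` and `sensitivity` values would then be averaged.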
Author
郑杨
ZHENG Yang (School of Food and Chemical Engineering, Shaoyang University, Shaoyang 422000, China)
Source
《邵阳学院学报(自然科学版)》
2021, No. 6, pp. 78-87 (10 pages)
Journal of Shaoyang University: Natural Science Edition
Funding
Hunan Provincial Postgraduate Innovation Project (CX2020SY059).