N6-Methyladenine (m6A) is a dynamic and reversible post-transcriptional modification that plays an essential role in various biological processes. Because of the current inability to identify m6A-containing mRNAs, computational approaches have been developed to identify m6A sites in DNA sequences. Aiming to improve prediction performance, we introduced a novel ensemble computational approach based on three hybrid deep neural networks, comprising a convolutional neural network, a capsule network, and a bidirectional gated recurrent unit (BiGRU) with a self-attention mechanism, to identify m6A sites in four tissues of three species. Across a total of 11 datasets, we selected different feature subsets, optimized from an initial set of 4933-dimensional features, as input for the deep hybrid neural networks. In addition, to reduce the bias caused by the relatively small number of experimentally verified samples, we constructed an ensemble model by integrating five sub-classifiers trained on different training datasets. In 5-fold cross-validation and independent tests, our model outperformed the previous methods im6A-TS-CNN and iRNA-m6A.
Funding: Supported by the National Natural Science Foundation of China (Nos. 62071079 and 61803065).
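The ensemble step mentioned in the abstract (five sub-classifiers trained on different training datasets, then integrated) can be illustrated with a minimal sketch. The abstract does not say how the training datasets differ or how the outputs are combined, so this sketch assumes bootstrap resampling and soft voting (averaging predicted probabilities). A plain logistic regression stands in for the hybrid deep networks, and every function name and the toy data are hypothetical.

```python
import numpy as np

def train_logistic(X, y, lr=0.1, epochs=200):
    """Stand-in sub-classifier: logistic regression fit by gradient descent.
    (Assumption: the paper uses hybrid deep networks here instead.)"""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probabilities
        grad = p - y                            # gradient of the log-loss w.r.t. the logits
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

def predict_proba(model, X):
    """Probability of the positive class for one sub-classifier."""
    w, b = model
    return 1.0 / (1.0 + np.exp(-(X @ w + b)))

def train_ensemble(X, y, n_models=5, seed=0):
    """Train n_models sub-classifiers, each on a different bootstrap resample.
    (Assumption: bootstrap resampling is one way to obtain different training sets.)"""
    rng = np.random.default_rng(seed)
    models = []
    for _ in range(n_models):
        idx = rng.integers(0, len(y), size=len(y))  # sample rows with replacement
        models.append(train_logistic(X[idx], y[idx]))
    return models

def ensemble_proba(models, X):
    """Soft voting: average the sub-classifiers' predicted probabilities."""
    return np.mean([predict_proba(m, X) for m in models], axis=0)

# Toy data standing in for encoded sequence features (not real m6A data).
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 8))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)

models = train_ensemble(X, y)
probs = ensemble_proba(models, X)
preds = (probs >= 0.5).astype(int)
accuracy = float((preds == y).mean())
```

Averaging probabilities rather than taking a hard majority vote keeps each sub-classifier's confidence in the final score; whether the authors used this rule is not stated in the abstract.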