Abstract
The mechanisms by which miRNAs interact with their target genes are highly complex, so miRNA target gene identification has long been a prominent and difficult problem in miRNA research. Based on the CLASH dataset, this paper proposes sequence features of miRNA-target site pairing and builds a model with random forest. Experimental results show that the model achieves average accuracy (Acc), sensitivity (Sen), specificity (Spe), precision (Pre) and Matthews correlation coefficient (Mcc) of 90.05%, 89.47%, 90.56%, 90.43% and 0.7998, respectively, and that the AUC values of the ROC and PRC curves are 0.954 and 0.958, respectively. The method performs better than existing methods, indicating that the newly introduced miRNA-target site pairing sequence features have a substantial impact on miRNA target gene identification.
Source
《分析测试学报》
CAS
CSCD
Peking University Core Journals (北大核心)
2017, Issue 5, pp. 614-620 (7 pages)
Journal of Instrumental Analysis
Funding
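National Natural Science Foundation of China (21675180)
Science and Technology Planning Project of Guangdong Province (2014A040401022, 2015A030401033, 2016B010108007)
Science and Technology Planning Project of Guangzhou (201604020145)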