Abstract
To identify circular RNAs (circRNAs) in the sweet orange (Citrus sinensis) genome and to clarify their biological functions during the interaction between C. sinensis and pathogens, this study developed a circRNA identification pipeline for C. sinensis in a Python environment, based on a machine-learning random forest model. The study compared different modeling algorithms, identified circRNAs in the C. sinensis genome, constructed circRNA-miRNA and circRNA-miRNA-mRNA interaction networks, and performed gene functional enrichment analysis on the targeted mRNAs. Of the three algorithms compared (random forest, decision tree, and feedforward neural network), the model built with random forest performed best. A total of 2523 circRNAs were identified; they were unevenly distributed across the nine chromosomes, with chromosome 5 carrying the most (416). In total, 606 circRNA-miRNA interaction pairs and 21043 miRNA-mRNA interaction pairs were predicted. The targeted mRNAs were broadly involved in metabolism, transport, and development, in pathways including phenylpropanoid biosynthesis, linoleic acid metabolism, and plant-pathogen interaction. C. sinensis circRNAs influenced the transcriptional regulation of disease-resistance-related miRNAs such as Csi-miR172 and Csi-miR482. This study provides a reference for research on the involvement of circRNAs in disease-resistance biological processes in C. sinensis.
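The abstract describes building a circRNA-miRNA-mRNA network from two sets of predicted pairs. One plausible way to do this, sketched below, is to join circRNA-miRNA pairs with miRNA-mRNA pairs on the shared miRNA. This is a minimal illustration, not the authors' pipeline: the function name `build_triples`, the circRNA and gene identifiers, and the pairings themselves are hypothetical; only the miRNA names come from the abstract.

```python
from collections import defaultdict


def build_triples(circ_mirna_pairs, mirna_mrna_pairs):
    """Join circRNA-miRNA and miRNA-mRNA pairs on the shared miRNA.

    Returns a sorted list of (circRNA, miRNA, mRNA) triples.
    """
    # Index the mRNA targets of each miRNA for constant-time lookup.
    targets_by_mirna = defaultdict(set)
    for mirna, mrna in mirna_mrna_pairs:
        targets_by_mirna[mirna].add(mrna)

    # Every circRNA paired with a miRNA is linked to all of that miRNA's targets.
    triples = set()
    for circ, mirna in circ_mirna_pairs:
        for mrna in targets_by_mirna.get(mirna, ()):
            triples.add((circ, mirna, mrna))
    return sorted(triples)


# Hypothetical identifiers and pairings, for illustration only.
circ_mirna = [
    ("circ_0001", "Csi-miR172"),
    ("circ_0002", "Csi-miR482"),
    ("circ_0003", "Csi-miR172"),
]
mirna_mrna = [
    ("Csi-miR172", "gene_A"),
    ("Csi-miR482", "gene_B"),
    ("Csi-miR482", "gene_C"),
]

network = build_triples(circ_mirna, mirna_mrna)
for circ, mirna, mrna in network:
    print(f"{circ} -> {mirna} -> {mrna}")
```

A circRNA whose miRNA has no predicted target contributes no triple, so the resulting network contains only complete three-part paths.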
Authors
刘畅
闫亚娜
黄桂艳
李瑞民
LIU Chang; YAN Yanan; HUANG Guiyan; LI Ruimin (College of Life Sciences, Gannan Normal University, Ganzhou, 341000)
Source
《基因组学与应用生物学》
CAS
CSCD
PKU Core Journals (北大核心)
2024, Issue 2, pp. 250-260 (11 pages)
Genomics and Applied Biology
Funding
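Jointly funded by the National Natural Science Foundation of China (32260659) and a Jiangxi Provincial Department of Education project (GJJ201432).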
Keywords
Sweet orange (Citrus sinensis)
Circular RNA (circRNA)
Random forest model
Target genes
Transcriptional regulation