Abstract
Long non-coding RNAs (lncRNAs) are a class of non-coding RNAs longer than 200 nt. They have no protein-coding potential but participate in many biological processes as RNA molecules. In this study, we identified 8044 lncRNAs from upland cotton shoot apex transcriptome (RNA-seq) data previously generated in our laboratory, using coding potential calculator software. Among them, 3691 lncRNAs were mapped to the At subgenome of allotetraploid cotton and 2852 to the Dt subgenome. A total of 2227 lncRNAs were functionally annotated using bioinformatics methods such as genomic co-location, complementary base-pairing, pre-miRNA prediction, and lncRNA family prediction. Of these, 1875 were located upstream or downstream of coding genes and may regulate gene expression at the transcriptional or post-transcriptional level by binding to cis-acting elements or 3' UTRs of the coding genes; 317 antisense lncRNAs were predicted to interact with sense-strand mRNAs, regulating gene silencing, transcription, and mRNA stability through complementary base-pairing; 20 lncRNAs were predicted to be microRNA precursors; and five lncRNAs were annotated to four lncRNA families. Functional annotation indicated that these lncRNAs are mainly involved in biological processes such as transcriptional regulation, metabolism, hormone responses, and signal transduction. This study provides a new approach to studying cotton lncRNAs using high-throughput sequencing data.
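The genomic co-location step mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Feature` type, the `classify` function, the category labels, the chromosome name `A01`, and the 10 kb flanking window are all assumptions made for illustration, since the abstract does not state the distance threshold or the tools used for this step.

```python
from dataclasses import dataclass

# Hypothetical flanking window in bp; the abstract does not state the threshold.
FLANK_WINDOW = 10_000


@dataclass(frozen=True)
class Feature:
    """A genomic interval with 1-based, inclusive coordinates."""
    chrom: str
    start: int
    end: int
    strand: str  # "+" or "-"


def overlaps(a: Feature, b: Feature) -> bool:
    """True if the two features share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def gap(a: Feature, b: Feature) -> int:
    """Number of bases between two non-overlapping features on one chromosome."""
    return max(a.start, b.start) - min(a.end, b.end) - 1


def classify(lnc: Feature, genes: list[Feature], window: int = FLANK_WINDOW) -> str:
    """Assign an illustrative co-location category to one lncRNA.

    "antisense": overlaps a coding gene on the opposite strand.
    "flanking": lies within `window` bp upstream or downstream of a coding gene.
    "unclassified": neither of the above.
    """
    same_chrom = [g for g in genes if g.chrom == lnc.chrom]
    if any(overlaps(lnc, g) and g.strand != lnc.strand for g in same_chrom):
        return "antisense"
    if any(not overlaps(lnc, g) and gap(lnc, g) <= window for g in same_chrom):
        return "flanking"
    return "unclassified"


# Made-up coordinates, for demonstration only.
genes = [Feature("A01", 1000, 2000, "+")]
antisense_call = classify(Feature("A01", 1500, 1800, "-"), genes)
flanking_call = classify(Feature("A01", 2500, 3000, "+"), genes)
distant_call = classify(Feature("A01", 50_000, 51_000, "+"), genes)
```

A real analysis would read gene and lncRNA coordinates from an annotation file and would likely use an interval index rather than a linear scan over all genes.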
Source
《棉花学报》
CSCD
Peking University Core Journal
2016, Issue 5, pp. 470-477 (8 pages)
Cotton Science
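Funding
China Agriculture Research System, Cotton Industry Technology System (CARS-18-10)
Taishan Scholars Construction Project, Special Fund (No. ts201511070)
Key Science and Technology Innovation Project of the Shandong Academy of Agricultural Sciences (2014CXZ10-3)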
Keywords
cotton
shoot apical RNA-seq data
long non-coding RNA
identification
functional prediction