Transcript expression is spatially and temporally regulated, and is complex, precise and specific; however, most studies have focused on protein-coding genes. Recently, massively parallel cDNA sequencing (RNA-seq) has emerged as a new and promising tool for transcriptome research, and numerous non-coding RNAs, especially lincRNAs, have been identified and characterized as important regulators of diverse biological processes. In this study, we used ultra-deep RNA-seq data from 15 mouse tissues to study the diversity and dynamics of non-coding RNAs in mouse. Using our own criteria, we identified a total of 16,249 non-coding genes (21,569 non-coding RNAs) in mouse. We annotated these non-coding RNAs by diverse properties and found that, compared with protein-coding genes, they are generally shorter, have fewer exons, are expressed at lower levels and are more strikingly tissue-specific. Moreover, these non-coding RNAs show significant enrichment for transcriptional initiation and elongation signals, including histone modifications (H3K4me3, H3K27me3 and H3K36me3), RNAPII binding sites and CAGE tags. Gene set enrichment analysis (GSEA) revealed several sets of lincRNAs associated with diverse biological processes such as immune effector process, muscle development and sexual reproduction. Taken together, this study provides a more comprehensive annotation of mouse non-coding RNAs and offers a basis for future functional and evolutionary studies of mouse non-coding RNAs.
Funding: supported by grants from the Natural Science Foundation of China (31271385) and the Knowledge Innovation Program of the Chinese Academy of Sciences (KSCX2-EW-R-01-04).