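Abstract: In recent years, ISO-seq data generated by single-molecule sequencing have been increasingly used to predict novel transcript isoforms because of their very long reads. However, most existing studies use only the full-length reads and discard the useful information carried by non-full-length reads, so the data are not fully exploited. To address this problem, this paper retains the non-full-length reads and proposes two models that simultaneously predict isoform structures and estimate their expression proportions: Dirichlet sampling for isoform detection and prediction (DSIDP) and Markov chain for isoform detection and prediction (MCIDP). Both models build the candidate isoform set from full-length reads and use both full-length and non-full-length reads to estimate isoform expression proportions. DSIDP aligns all reads to the candidate isoform set and uses Dirichlet sampling to resolve reads that map to more than one isoform. MCIDP uses a Markov chain to model alternative splicing between the exons of a gene, and it can also predict isoforms for which the data contain no full-length read. The two models are validated on simulated and real data.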
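The abstract names the two techniques but does not give their formulas. As a rough illustration only, the Python sketch below assumes (1) a simple Gibbs sampler with a symmetric Dirichlet prior for sharing ambiguous reads among candidate isoforms, and (2) a first-order Markov chain over exons whose transition probabilities are estimated from consecutive exon pairs in all reads. Every function name, the prior, and the toy data are hypothetical and are not taken from DSIDP or MCIDP.

```python
import random
from collections import Counter, defaultdict


def sample_dirichlet(alphas, rng):
    """Draw one sample from Dirichlet(alphas) via normalised gamma variates."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def gibbs_isoform_proportions(read_hits, n_isoforms, n_iter=2000,
                              burn_in=500, prior=1.0, seed=0):
    """Estimate isoform proportions when reads may match several isoforms.

    read_hits[r] lists the isoform indices compatible with read r
    (assumed input; a non-full-length read typically has several).
    """
    rng = random.Random(seed)
    theta = [1.0 / n_isoforms] * n_isoforms
    mean_theta = [0.0] * n_isoforms
    kept = n_iter - burn_in
    for it in range(n_iter):
        counts = [0] * n_isoforms
        # Step 1: assign each read to one compatible isoform, weighted by theta.
        for hits in read_hits:
            weights = [theta[i] for i in hits]
            counts[rng.choices(hits, weights=weights)[0]] += 1
        # Step 2: resample the proportions from the Dirichlet posterior.
        theta = sample_dirichlet([prior + c for c in counts], rng)
        # Average the post-burn-in samples to get a point estimate.
        if it >= burn_in:
            for i, t in enumerate(theta):
                mean_theta[i] += t / kept
    return mean_theta


def exon_transition_probs(reads):
    """Estimate P(next exon | current exon) from consecutive exons in reads."""
    counts = defaultdict(Counter)
    for exons in reads:
        for a, b in zip(exons, exons[1:]):
            counts[a][b] += 1
    return {a: {b: n / sum(nxt.values()) for b, n in nxt.items()}
            for a, nxt in counts.items()}


def path_probability(path, probs):
    """Probability of an exon chain, conditional on its first exon."""
    p = 1.0
    for a, b in zip(path, path[1:]):
        p *= probs.get(a, {}).get(b, 0.0)
    return p


# Toy data (hypothetical): 6 reads unique to isoform 0, 2 unique to
# isoform 1, and 4 non-full-length reads compatible with both.
read_hits = [[0]] * 6 + [[1]] * 2 + [[0, 1]] * 4
theta = gibbs_isoform_proportions(read_hits, n_isoforms=2)
print("estimated proportions:", theta)

# Each read is the ordered list of exons it covers; none spans 1-2-4.
exon_reads = [[1, 2, 3], [1, 2], [2, 3], [1, 3], [2, 4]]
probs = exon_transition_probs(exon_reads)
print("P(exon path 1-2-4):", path_probability([1, 2, 4], probs))
```

In the toy data no read covers exons 1-2-4 in full, yet the chain still assigns that path a nonzero probability because the partial reads support each individual junction. This mirrors the abstract's statement that MCIDP can predict isoforms lacking full-length reads, though the published model may define its states and transitions differently.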
Funding: Guangxi Natural Science Foundation (2024GXNSFGA010003); National Natural Science Foundation of China (32270712 and 31871269); Guangxi Science and Technology Major Program (AA23062085).
Abstract: The transcriptome serves as a bridge that links genomic variation to phenotypic diversity. A vast number of studies using next-generation RNA sequencing (RNA-seq) over the last two decades have emphasized the essential roles of the plant transcriptome in response to developmental and environmental conditions, providing numerous insights into the dynamic changes, evolutionary traces, and elaborate regulation of the plant transcriptome. With substantial improvement in accuracy and throughput, direct RNA sequencing (DRS) has emerged as a new and powerful sequencing platform for precise detection of native and full-length transcripts, overcoming many limitations such as read length and PCR bias that are inherent to short-read RNA-seq. Here, we review recent advances in dissecting the complexity and diversity of plant transcriptomes using DRS as the main technological approach, covering many aspects of RNA metabolism, including novel isoforms, poly(A) tails, and RNA modification, and we propose a comprehensive workflow for processing plant DRS data. Many challenges to the application of DRS in plants, such as the need for machine learning tools tailored to plant transcriptomes, remain to be overcome, and we outline future biological questions that DRS can address, such as allele-specific RNA modification. This technology provides a convenient foundation for connecting distinct RNA features and will continue to refine our understanding of the biological functions of the plant transcriptome.