Abstract
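In recent years, RNA-seq has been widely used to detect differentially expressed genes and isoforms. However, most current methods identify differential expression of individual isoforms and cannot simultaneously detect differences in the expression proportions of the isoforms contained within the same gene. We therefore propose a method for detecting differential isoform proportions. The method builds on the previously designed sLDASeq model: it uses the probability distributions of the model's latent variables and applies the KL divergence to analyze differential isoform proportions. We first use the latest SEQC dataset to evaluate the performance of sLDASeq in estimating expression levels; the results show that the method accurately estimates the proportions of isoforms within a gene. We then detect differential isoform proportions on simulated datasets; compared with other methods, the experimental results show that the proposed method achieves high accuracy in detecting differential isoform proportions.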
RNA-seq technology has been widely applied to detecting differential gene and isoform expression. However, most existing methods detect differences in expression for each individual isoform of a gene, rather than differences in the ratios of all the isoforms within the same gene. We present a new method that tests each gene for a differential isoform ratio between two conditions. The method is based on the previously designed sLDASeq model and adopts the KL divergence to detect differential isoform ratios. We first use the new SEQC benchmark to validate the performance of sLDASeq on gene and isoform expression calculation. The results show that the model can accurately calculate the proportions of isoforms in a gene. We then use the KL divergence between the probability distributions of the latent variables of sLDASeq to detect differential isoform ratios between the two conditions in simulated datasets. The results show that the proposed method achieves high accuracy compared with other methods in detecting differential isoform ratios.
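To make the role of the KL divergence concrete, the following is a minimal sketch and not the authors' implementation. It assumes each condition is summarized by a vector of isoform proportions for one gene and scores the gene by KL(p || q). The function name `kl_divergence`, the smoothing constant `eps`, and the example proportions are hypothetical choices made for illustration; the abstract does not specify how sLDASeq's latent-variable distributions are compared in detail.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Return KL(p || q) in nats for two discrete distributions.

    p and q are sequences of non-negative weights of equal length
    (for example, isoform proportions of one gene in two conditions).
    Both are renormalized to sum to 1. eps is an assumed smoothing
    constant that avoids division by zero; it is not from the paper.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total_p, total_q = sum(p), sum(q)
    if total_p <= 0 or total_q <= 0:
        raise ValueError("each distribution must have positive total mass")
    p = [x / total_p for x in p]
    q = [x / total_q for x in q]
    # Terms with p_i == 0 contribute nothing to the sum.
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


# Hypothetical isoform proportions for one three-isoform gene.
condition_a = [0.70, 0.20, 0.10]
condition_b = [0.30, 0.50, 0.20]
score = kl_divergence(condition_a, condition_b)
```

A larger score indicates a larger shift in isoform proportions between the two conditions. Because the KL divergence is asymmetric, a practical pipeline might symmetrize it or fix a reference condition; that choice is left open here.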
Authors
欧书华
刘学军
张礼
OU Shu-hua, LIU Xue-jun, ZHANG Li (College of Computer Science & Technology, Nanjing University of Aeronautics & Astronautics, Nanjing 211106, China)
Source
《计算机工程与科学》
CSCD
Peking University Core Journal List (北大核心)
2017, Issue 1, pp. 158-164 (7 pages)
Computer Engineering & Science
Funding
National Natural Science Foundation of China (61170152)