Abstract
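Differential expression analysis of genes and isoforms is an important route to understanding gene and isoform function and has become a major area of bioinformatics. RNA-seq is a high-throughput sequencing technology that has been widely used in transcriptome research in recent years. In RNA-seq data, reads that map to multiple sources make the detection of differentially expressed isoforms challenging. To address this problem, this paper first computes the expression levels of genes and isoforms and then performs differential analysis. Building on the PGseq model for estimating expression levels and using the Bayes factor for model selection, it proposes a new differential detection method, PG_bayes, which handles differential detection for both genes and isoforms. PG_bayes is applied to four real datasets from human and mouse and compared with currently popular differential detection methods. The experimental results show that PG_bayes achieves high accuracy and sensitivity in detecting both differentially expressed genes and differentially expressed isoforms, and shows an advantage in differential isoform detection.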
Differential expression analysis of genes and isoforms is important for understanding the function of genes and isoforms, and has become an essential research focus of bioinformatics. RNA-seq is an experimental technique based on high-throughput sequencing that is increasingly used in transcriptome research. Read-isoform multi-mappings make it difficult to detect differential expression of isoforms. Here, we propose a new method, called PG_bayes, to detect differential expression for both genes and isoforms. PG_bayes, based on the expression estimation method PGseq, uses a Bayes factor model selection method to detect differential expression. We applied PG_bayes to three human datasets and one mouse dataset, and compared its performance with popular alternatives. Results show that PG_bayes performs favorably in sensitivity and specificity at both gene and isoform levels.
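To illustrate the general idea of Bayes factor model selection for differential expression, the following is a minimal sketch and not the PG_bayes implementation. It assumes a simple Gamma-Poisson model on read counts as a stand-in, since the PGseq model is not described in this abstract. The function names `log_marginal` and `log_bayes_factor` and the prior parameters `a` and `b` are hypothetical choices made for this example.

```python
from math import lgamma, log


def log_marginal(counts, a=1.0, b=1.0):
    """Log marginal likelihood of Poisson counts sharing one rate,
    with a Gamma(shape=a, rate=b) prior on that rate (assumed model)."""
    n = len(counts)
    s = sum(counts)
    return (a * log(b) - lgamma(a)
            + lgamma(a + s) - (a + s) * log(b + n)
            - sum(lgamma(y + 1) for y in counts))


def log_bayes_factor(cond_a, cond_b, a=1.0, b=1.0):
    """Log Bayes factor of M1 (separate rate per condition)
    against M0 (one rate shared by both conditions)."""
    log_m1 = log_marginal(cond_a, a, b) + log_marginal(cond_b, a, b)
    log_m0 = log_marginal(list(cond_a) + list(cond_b), a, b)
    return log_m1 - log_m0


if __name__ == "__main__":
    # Made-up replicate counts for one gene or isoform in two conditions.
    print(log_bayes_factor([100, 110, 95], [10, 12, 9]))
    print(log_bayes_factor([10, 11, 9], [10, 12, 9]))
```

A positive log Bayes factor favors the model with different expression levels in the two conditions, and a negative value favors the shared-rate model. The counts in the usage lines are invented for illustration only.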
Source
《数据采集与处理》
CSCD
Peking University Core Journals (北大核心)
2016, No. 5, pp. 965-973 (9 pages)
Journal of Data Acquisition and Processing
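Funding
Supported by the National Natural Science Foundation of China (61170152)
Supported by the Fundamental Research Funds for the Central Universities (CXZZ11_0217)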