Abstract
The usage of k-mers (mainly k = 6) in DNA sequences was analyzed under three approaches: a Markov chain model, k-mer occurrence frequencies, and a weighted Markov chain model. On this basis, a fuzzy relative entropy was defined to measure the structural difference between two DNA sequences. Promoter sequences of genes with low transcription frequencies were taken as the control set, and the fuzzy relative entropy of k-mer membership degrees was calculated between this control set and the promoter sequences of other yeast genes with different transcription frequencies. A positive linear correlation was found between gene transcription frequency and fuzzy relative entropy. In general, the larger the difference in transcription frequency between two genes, the more pronounced the difference in the structure of their promoter sequences. This suggests that the sequence structure of yeast gene promoters is associated to some extent with gene transcription frequency. Compared with the Markov chain model and k-mer frequency analysis, the fuzzy relative entropy under the weighted Markov chain model measures differences in promoter sequence structure more effectively. This indicates that the weighted Markov chain model, especially the 4th-order model, better characterizes the sequence organization of yeast gene promoters.
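The frequency-analysis variant of the comparison can be sketched as follows. This is a minimal illustration, not the paper's method: the abstract does not give the definition of the membership degree or of the fuzzy relative entropy, so the sketch assumes that a k-mer's membership degree is its pseudocount-smoothed relative frequency and that the entropy takes the common fuzzy cross-entropy form, sum over k-mers of mu_A ln(mu_A/mu_B) + (1 - mu_A) ln((1 - mu_A)/(1 - mu_B)). The function names `kmer_memberships` and `fuzzy_relative_entropy` and the `pseudocount` parameter are hypothetical.

```python
from collections import Counter
from itertools import product
from math import log

ALPHABET = "ACGT"


def kmer_memberships(seqs, k=6, pseudocount=1.0):
    """Return a smoothed relative frequency in (0, 1) for each of the 4**k k-mers.

    Assumption: relative frequency stands in for the paper's membership
    degree, which the abstract does not define.
    """
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Skip windows containing ambiguous bases such as N.
            if set(kmer) <= set(ALPHABET):
                counts[kmer] += 1
    total = sum(counts.values()) + pseudocount * 4 ** k
    return {
        "".join(p): (counts["".join(p)] + pseudocount) / total
        for p in product(ALPHABET, repeat=k)
    }


def fuzzy_relative_entropy(mu_a, mu_b):
    """Assumed fuzzy cross-entropy form; zero when the two membership maps agree."""
    total = 0.0
    for kmer, a in mu_a.items():
        b = mu_b[kmer]
        total += a * log(a / b) + (1 - a) * log((1 - a) / (1 - b))
    return total


# Toy usage with made-up sequences and k = 2 to keep the example small.
control = ["ACGTACGTACGT", "ACGTTTGACA"]
other = ["TTTTTTTTGGGG", "GGGGCCCCAAAA"]
mu_control = kmer_memberships(control, k=2)
mu_other = kmer_memberships(other, k=2)
divergence = fuzzy_relative_entropy(mu_other, mu_control)
```

Under these assumptions the measure is non-negative and grows as the k-mer usage of a promoter set departs from that of the control set. The Markov chain and weighted Markov chain variants would replace the observed frequencies with model-derived quantities, which the abstract does not specify.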
Source
《云南大学学报(自然科学版)》
CAS
CSCD
Peking University Core Journals (北大核心)
2010, No. 5, pp. 594-600, 607 (8 pages in total)
Journal of Yunnan University (Natural Sciences Edition)
Funding
National Natural Science Foundation of China (Grant No. 30360027)
Yunnan Provincial Applied Basic Research Foundation (Grant No. 2007A023M)