Abstract
The generalized hidden Markov model (GHMM) is an important model for computational gene finding. Unlike the traditional hidden Markov model (HMM), it does not require state durations to follow a geometric distribution, which makes it better suited to gene finding. Its drawback is high computational cost, so an efficient simplified algorithm is needed. By exploiting the structural characteristics of genes, a new simplified algorithm is proposed that imposes no additional constraints and whose computational complexity is linear in the sequence length. Tests on real biological sequence data demonstrate the effectiveness of the simplified algorithm.
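The geometric state-duration property mentioned in the abstract can be illustrated with a small sketch. It is not taken from the paper. It only shows the standard fact that an ordinary HMM state with self-transition probability `p_stay` lasts exactly `d` steps with probability `p_stay**(d-1) * (1 - p_stay)`. The function names and the value of `p_stay` are hypothetical, chosen for illustration.

```python
def geometric_duration_pmf(p_stay: float, d: int) -> float:
    """Probability that an ordinary HMM state lasts exactly d steps.

    The state is re-entered (d - 1) times with probability p_stay each,
    then left once with probability (1 - p_stay).
    """
    if d < 1:
        return 0.0
    return p_stay ** (d - 1) * (1.0 - p_stay)


def expected_duration(p_stay: float) -> float:
    """Mean of the geometric duration distribution, 1 / (1 - p_stay)."""
    return 1.0 / (1.0 - p_stay)


if __name__ == "__main__":
    p_stay = 0.99  # hypothetical self-transition probability
    for d in (1, 50, 100, 200):
        print(d, geometric_duration_pmf(p_stay, d))
    print("mean duration:", expected_duration(p_stay))
```

Because this distribution always decreases with `d`, the single most likely duration is one step, whatever `p_stay` is. A GHMM avoids this by letting each state carry its own explicit length distribution, at the higher computational cost that the abstract describes.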
Source
《国防科技大学学报》
Indexed in: EI, CAS, CSCD, Peking University Core Journals (北大核心)
2004, No. 4, pp. 103-106 (4 pages)
Journal of National University of Defense Technology
Funding
Military Basic Research Project (JC-02-03-021)