Abstract
Objective: To explore the use of the hidden Markov model (HMM) for identifying coding sequences in Escherichia coli, and to provide a methodological reference for biological information mining and the study of pathogenic loci. Methods: An HMM was trained on an Escherichia coli training set and then used to identify the test sequences; performance was evaluated by sensitivity, specificity, and accuracy. Results: The method identified coding sequences with a sensitivity of 73.33%, a specificity of 67.78%, and an accuracy of 70.56%. Conclusion: The hidden Markov model simulates transitions between discrete states very well and is suited to identifying linear sequence data with state transitions.
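The three evaluation measures named in the abstract can be sketched as follows. This is a minimal illustration that assumes the standard confusion-matrix definitions and takes "accuracy" to mean the overall fraction of correctly classified sequences; the abstract does not state the exact formulas or the test-set size. The function name `classification_metrics` and the counts are hypothetical, chosen only because they are consistent with the reported percentages, and are not taken from the paper.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp: coding sequences correctly identified as coding
    fn: coding sequences missed (labelled non-coding)
    tn: non-coding sequences correctly identified as non-coding
    fp: non-coding sequences wrongly labelled coding
    """
    sensitivity = tp / (tp + fn)                # fraction of true coding sequences found
    specificity = tn / (tn + fp)                # fraction of true non-coding sequences rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # fraction of all sequences classified correctly
    return sensitivity, specificity, accuracy


# Hypothetical counts (NOT from the paper): 90 coding and 90 non-coding test sequences.
sens, spec, acc = classification_metrics(tp=66, fn=24, tn=61, fp=29)
print(f"sensitivity={sens:.2%} specificity={spec:.2%} accuracy={acc:.2%}")
```

With equally sized coding and non-coding test sets, accuracy is simply the mean of sensitivity and specificity, which agrees with the reported figures: (73.33% + 67.78%) / 2 ≈ 70.56%.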
Source
《中国卫生统计》
CSCD
Peking University Core Journal (北大核心)
2015, No. 2, pp. 254-256 (3 pages)
Chinese Journal of Health Statistics
Funding
Supported by the National Natural Science Foundation of China (31071156)