Abstract
Because an enormous number of DNA sequences must be processed, computational speed is an important consideration. Although existing algorithms for recognizing DNA splice sites achieve high accuracy, most of them are computationally intensive. The naïve Bayesian classifier is simple and efficient, but its attribute independence assumption cannot represent the dependencies between attributes in the real world, which degrades its classification performance. This paper improves the naïve Bayesian classifier: a linear relationship between the decision attribute and the logarithms of the condition attributes is derived, and the coefficients of this relationship are determined by the least squares method, yielding a new Bayesian classifier. The improved classifier is applied to the recognition of splice sites in DNA sequences. Simulation results show that the computation time of the algorithm grows linearly with the number of test samples, and that its recognition accuracy is notably higher than that of the naïve Bayesian classifier and also higher than that of existing recognition algorithms, while the method runs significantly faster.
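The abstract does not reproduce the derivation, so the following is only a minimal sketch of the general idea, under stated assumptions: each fixed-length sequence window is turned into per-position log-likelihood ratios (the quantities a naïve Bayesian classifier would simply add up), and a least squares fit then assigns each position its own coefficient plus an intercept. The function names, the +1/-1 label encoding, the Laplace smoothing constant, and the toy sequences are all hypothetical choices made for illustration; they are not taken from the paper.

```python
import numpy as np

BASES = "ACGT"

def log_ratio_table(pos_seqs, neg_seqs, alpha=1.0):
    """Per-position log P(base | site) - log P(base | non-site)."""
    length = len(pos_seqs[0])

    def freqs(seqs):
        # Start every count at alpha (Laplace smoothing, an assumption here).
        counts = np.full((length, len(BASES)), alpha)
        for seq in seqs:
            for i, base in enumerate(seq):
                counts[i, BASES.index(base)] += 1
        return counts / counts.sum(axis=1, keepdims=True)

    return np.log(freqs(pos_seqs)) - np.log(freqs(neg_seqs))

def features(seqs, table):
    """One log-ratio per position, plus a constant 1.0 for the intercept."""
    rows = [
        [table[i, BASES.index(base)] for i, base in enumerate(seq)] + [1.0]
        for seq in seqs
    ]
    return np.array(rows)

def fit_weights(pos_seqs, neg_seqs, table):
    """Least squares fit of labels (+1 site, -1 non-site) on the log-ratios."""
    X = features(pos_seqs + neg_seqs, table)
    y = np.array([1.0] * len(pos_seqs) + [-1.0] * len(neg_seqs))
    # lstsq returns the minimum-norm solution when columns are redundant.
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def predict(seqs, table, w):
    """Classify as a site when the weighted linear score is positive."""
    return features(seqs, table) @ w > 0

# Toy windows invented for illustration: "sites" carry GT in the middle.
POS = ["AGTA", "CGTC", "TGTT"]
NEG = ["AACA", "CCAC", "TACT"]

table = log_ratio_table(POS, NEG)
w = fit_weights(POS, NEG, table)
print(predict(["GGTG", "TCAT"], table, w))
```

In this sketch, setting every position coefficient to 1 and the intercept to the log prior ratio would give back the ordinary naïve Bayesian decision rule; the least squares step is what lets positions contribute unequally. Prediction costs one table lookup and one multiply-add per position, so the total time grows linearly with the number of test windows.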
Source
《系统仿真学报》
CAS
CSCD
Peking University Core Journals (北大核心)
2011, Issue 7, pp. 1429-1432 (4 pages)
Journal of System Simulation
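Funding
National Natural Science Foundation of China (60671061)
University Doctoral Start-up Fund Project 7254 (Shenyang University of Chemical Technology)
Shenyang Science and Technology Project, Applied Basic Research Program (1081236-1-00)
Liaoning Provincial Department of Education Research Project (L2010438)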