
An Optimal Bayes Classification Algorithm  Cited by: 14
Abstract: Bayesian classification rests on a rigorous mathematical foundation, which has made it a simple and effective data mining method. However, the Bayesian classifier relies on two requirements, the conditional independence assumption and a weight of 1 for every attribute, and these greatly reduce its performance. To address these limitations, this paper proposes an Optimal Bayes Classification Algorithm (OBCA). First, rough set theory is used to perform attribute reduction on the data set to be classified, deleting redundant attributes. Next, a calculation method and formulas for the attribute weights are given, with the aim of describing the importance and correlation of the data set more accurately. Finally, comparison tests are run with weka3.6.2 on data sets from the UCI machine learning database. The experimental results show that OBCA achieves higher classification accuracy than the other methods in the comparison.
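Source: Computer Measurement & Control (《计算机测量与控制》), CSCD, Peking University Core Journal, 2012, No. 1, pp. 199-201 (3 pages)
Funding: National Natural Science Foundation of China (60573145); Doctoral Program Fund of the Ministry of Education (200805610019); Guangzhou Science and Technology Plan Project (2007J1-C0401)
Keywords: Bayes classification algorithm; attribute reduction; importance; correlation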
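
The two steps named in the abstract, rough-set attribute reduction followed by a weighted Bayesian classifier, can be illustrated with a minimal sketch. This is not the paper's OBCA: the abstract does not give the weight formula, so the weights below are placeholders set to 1.0, and the greedy backward elimination, the function names, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def dependency(rows, labels, attrs):
    """Rough-set dependency degree: share of rows whose values on `attrs`
    determine the class label unambiguously."""
    seen_labels = defaultdict(set)
    sizes = Counter()
    for row, y in zip(rows, labels):
        key = tuple(row[a] for a in attrs)
        seen_labels[key].add(y)
        sizes[key] += 1
    consistent = sum(n for key, n in sizes.items() if len(seen_labels[key]) == 1)
    return consistent / len(rows)


def reduce_attributes(rows, labels):
    """Greedy backward elimination (an assumed heuristic, not the paper's):
    drop an attribute whenever the dependency degree is unchanged without it."""
    kept = list(range(len(rows[0])))
    full = dependency(rows, labels, kept)
    for a in list(kept):
        trial = [b for b in kept if b != a]
        if trial and dependency(rows, labels, trial) == full:
            kept = trial
    return kept


def train(rows, labels, attrs):
    """Collect class counts, per-class value counts, and value vocabularies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attr, class) -> Counter of values
    vocab = defaultdict(set)             # attr -> set of observed values
    for row, y in zip(rows, labels):
        for a in attrs:
            value_counts[(a, y)][row[a]] += 1
            vocab[a].add(row[a])
    return class_counts, value_counts, vocab


def predict(model, row, attrs, weights):
    """Weighted naive Bayes: argmax over classes c of
    log P(c) + sum_a w_a * log P(x_a | c), with Laplace smoothing.
    Setting every weight to 1 gives plain naive Bayes."""
    class_counts, value_counts, vocab = model
    total = sum(class_counts.values())
    best, best_score = None, -math.inf
    for c, n_c in class_counts.items():
        score = math.log(n_c / total)
        for a in attrs:
            p = (value_counts[(a, c)][row[a]] + 1) / (n_c + len(vocab[a]))
            score += weights[a] * math.log(p)
        if score > best_score:
            best, best_score = c, score
    return best


# Hypothetical toy data: attribute 2 duplicates attribute 0, attribute 1 is noise.
rows = [
    ("sunny", "hot", "s"),
    ("sunny", "cool", "s"),
    ("rainy", "hot", "r"),
    ("rainy", "cool", "r"),
]
labels = ["no", "no", "yes", "yes"]

attrs = reduce_attributes(rows, labels)
weights = {a: 1.0 for a in attrs}  # placeholder weights, not the paper's formula
model = train(rows, labels, attrs)
print(attrs, predict(model, ("rainy", "hot", "r"), attrs, weights))
```

Passing the weights in as a plain dictionary keeps the classifier independent of how they are derived, so a weight formula based on attribute importance and correlation could be substituted without touching `predict`.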

