
A Review on Classification Algorithms in Data Mining (数据挖掘中分类算法综述)

Cited by: 45
Abstract: This paper analyzes the key problems that classification algorithms must solve and reviews the ideas and characteristics of the main families of classifiers. Decision tree algorithms handle noisy data well but are effective only on small training sets. Bayesian classification is credited with high precision, high speed, and a low error rate, yet its classifications are described as not accurate enough. Traditional association-rule-based classification achieves high accuracy but is easily constrained by available hardware memory. Support vector machines offer high accuracy and low complexity but run slowly. Building on the strengths of these methods while addressing their weaknesses, the paper then discusses more recent algorithms that are faster and more accurate, such as multiple-decision-tree combination techniques, a hybrid classification algorithm based on prior information and information gain, and a neural network classification algorithm based on rough sets and genetic algorithms. It closes with an outlook on data mining classification algorithms and proposes priorities for future research.
Author: 李玲俐 (Li Lingli)
Source: Journal of Chongqing Normal University: Natural Science (《重庆师范大学学报(自然科学版)》), CAS, 2011, No. 4, pp. 44-47 (4 pages)
Keywords: data mining; classification; review
