Abstract
Imprecise and incomplete data are widespread and make decision tree algorithms difficult to apply. To address this, the paper proposes an adaptive decision tree algorithm based on the variable precision rough set model. The algorithm resists noise and reduces tree complexity by using approximate precision as the attribute selection criterion, with the rough entropy of attributes as an auxiliary criterion. A reasonable threshold setting improves classification accuracy and noise resistance, and also allows the size of the decision tree to be controlled adaptively. Comparative experiments on three UCI datasets show that the proposed algorithm achieves good classification results on imprecise and incomplete data.
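As an illustration of approximate precision as an attribute selection criterion, the following is a minimal Python sketch and not the authors' implementation. It assumes the standard variable precision rough set definitions with an inclusion threshold `beta` in (0.5, 1], and it assumes that an attribute's score is the total size of the β-lower approximations divided by the total size of the β-upper approximations over all decision classes. The abstract does not give the paper's exact formula. The function names and the toy data are hypothetical.

```python
from collections import defaultdict

def partition(rows, attr):
    """Group row indices into equivalence classes by the value of one attribute."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[row[attr]].append(i)
    return list(blocks.values())

def approx_precision(rows, attr, decision, beta=0.75):
    """Assumed score: sum of beta-lower sizes / sum of beta-upper sizes over decision classes."""
    blocks = partition(rows, attr)
    classes = {row[decision] for row in rows}
    lower = upper = 0
    for c in classes:
        for block in blocks:
            # Fraction of the block that belongs to decision class c.
            share = sum(rows[i][decision] == c for i in block) / len(block)
            if share >= beta:        # block lies in the beta-lower approximation
                lower += len(block)
            if share > 1 - beta:     # block lies in the beta-upper approximation
                upper += len(block)
    return lower / upper if upper else 0.0

def best_attribute(rows, attrs, decision, beta=0.75):
    """Pick the attribute with the highest assumed approximate precision."""
    return max(attrs, key=lambda a: approx_precision(rows, a, decision, beta))

# Toy data (hypothetical): attribute "a" predicts "play" except for one noisy row.
rows = [
    {"a": "x", "b": "p", "play": "yes"},
    {"a": "x", "b": "q", "play": "yes"},
    {"a": "x", "b": "p", "play": "yes"},
    {"a": "x", "b": "q", "play": "yes"},
    {"a": "y", "b": "p", "play": "no"},
    {"a": "y", "b": "q", "play": "no"},
    {"a": "y", "b": "p", "play": "no"},
    {"a": "y", "b": "q", "play": "no"},
    {"a": "y", "b": "p", "play": "yes"},  # noisy row
]

print(approx_precision(rows, "a", "play", beta=0.75))
print(approx_precision(rows, "a", "play", beta=1.0))
print(best_attribute(rows, ["a", "b"], "play"))
```

With `beta=0.75` the single noisy row is tolerated and attribute `a` scores 1.0. With `beta=1.0`, which corresponds to the classical rough set model, the same attribute scores only 4/14. This is one way to see how the threshold setting can trade noise resistance against strictness.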
Authors
曹宁
刘旭光
刘方
CAO Ning; LIU Xuguang; LIU Fang (Chinese Armed Police Force, Jinan 250014, China; School of Electronic Science and Engineering, National University of Defense Technology, Changsha 410073, China)
Source
《信息工程大学学报》
2018, No. 2, pp. 191-195 (5 pages)
Journal of Information Engineering University
Funding
National High Technology Research and Development Program of China (863 Program), Grant No. 2015AA7026087
Keywords
variable precision rough set
decision tree
approximate precision
rough entropy
C4.5