Abstract
The traditional ID3 algorithm cannot handle datasets with continuous attribute values. To address this, an improved algorithm is designed for processing continuous evaluation data. The improved algorithm first discretizes the continuous attribute values with a clustering algorithm, then computes the roughness of each attribute as the splitting criterion, and finally generates the decision tree with the improved ID3 algorithm. Simulation is used to verify the prediction accuracy of the method and to examine the conditions under which it applies. The experimental results show that, without reducing accuracy, the algorithm can process data with continuous attribute values while offering better readability and a lower computational cost.
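The three steps named in the abstract (cluster-based discretization, a roughness-based splitting criterion, ID3-style tree construction) can be illustrated with a minimal sketch. The abstract does not state which clustering algorithm or which roughness formula the paper uses, so everything below is an assumption: the discretizer is plain 1-D k-means, and "roughness" is taken as one minus the rough-set dependency degree (the share of samples whose attribute value does not determine the class). The function names `kmeans_1d`, `roughness`, and `build_tree` and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def assign(values, centers):
    """Label each value with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda c: abs(v - centers[c]))
            for v in values]


def kmeans_1d(values, k, iters=100):
    """Assumed discretizer: 1-D k-means (the paper's clustering method is not given)."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centers over the sorted values.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        labels = assign(values, centers)
        new_centers = []
        for c in range(k):
            members = [v for v, l in zip(values, labels) if l == c]
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return assign(values, centers)


def roughness(rows, labels, attr):
    """Assumed measure: fraction of samples outside the positive region of attr."""
    classes_seen = defaultdict(set)
    sizes = Counter()
    for row, label in zip(rows, labels):
        classes_seen[row[attr]].add(label)
        sizes[row[attr]] += 1
    # A value block is in the positive region if it maps to exactly one class.
    positive = sum(sizes[v] for v, seen in classes_seen.items() if len(seen) == 1)
    return 1 - positive / len(rows)


def build_tree(rows, labels, attrs):
    """ID3-style recursion that splits on the attribute with the lowest roughness."""
    if len(set(labels)) == 1:
        return labels[0]
    if not attrs:
        return Counter(labels).most_common(1)[0][0]
    best = min(attrs, key=lambda a: roughness(rows, labels, a))
    rest = [a for a in attrs if a != best]
    branches = {}
    for value in sorted({row[best] for row in rows}):
        pairs = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        branches[value] = build_tree([r for r, _ in pairs],
                                     [l for _, l in pairs], rest)
    return (best, branches)


# Toy evaluation data (made up for illustration only).
score = [55.0, 58.0, 61.0, 88.0, 90.0, 93.0]
attendance = [0.5, 0.9, 0.6, 0.95, 0.55, 0.9]
labels = ["fail", "fail", "fail", "pass", "pass", "pass"]

score_bins = kmeans_1d(score, 2)
attendance_bins = kmeans_1d(attendance, 2)
rows = [{"score": s, "attendance": a} for s, a in zip(score_bins, attendance_bins)]
tree = build_tree(rows, labels, ["score", "attendance"])
print(tree)  # ('score', {0: 'fail', 1: 'pass'})
```

In this toy data the discretized `score` separates the classes completely (roughness 0), while `attendance` separates nothing (roughness 1), so the tree splits once on `score`. Counting class-pure value blocks needs no logarithms, which is one plausible reading of the lower computational cost claimed in the abstract, though the paper's actual criterion may differ.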
Authors
WANG Zijing (王子京)
LIU Yu (刘毓)
School of Communications and Information Engineering, Xi'an University of Posts & Telecommunications, Xi'an 710121, China
Source
Modern Electronics Technique (《现代电子技术》)
Peking University Core Journal (北大核心)
2018, Issue 15, pp. 39-42 (4 pages)
Funding
Shaanxi Province Industrial Science and Technology Research Project (2016GY-113)
Keywords
data mining
decision tree
rough set
ID3 algorithm
big data
algorithm improvement