Abstract
To mine the operational data of different departments in electric power enterprises effectively, this paper proposes using the C5.0 decision tree algorithm to analyze the data in depth and provide managers with valuable decision support. First, the principle of the C5.0 decision tree algorithm is analyzed, and the original attribute selection method is improved by introducing information entropy, which speeds up the calculation of the information gain ratio. Then, the data in the power plant management information system are mined according to the designed electricity sales relationship model. Experimental results on UCI machine learning datasets and a power marketing dataset show that the improved C5.0 decision tree algorithm has good classification performance and can classify users in the electricity sales market quickly and accurately, with an accuracy of 86.5%.
Authors
卜晓阳
蔡岩
王宗伟
赵郭燚
BU Xiaoyang; CAI Yan; WANG Zongwei; ZHAO Guoyi (Customer Service Center, State Grid, Tianjin 300309, China; Software College, Hebei Normal University, Shijiazhuang 050024, China)
Source
《微型电脑应用》
2022, No. 1, pp. 23-26 (4 pages)
Microcomputer Applications
Funding
Hebei Provincial Universities Young Innovative Talents Program (2019HBQNCX032).
Keywords
data mining
C5.0 decision tree
electric power marketing
information entropy
classification prediction