Abstract
Classification is an important task in data mining, and decision trees are among the most important and widely used families of classification algorithms. This paper describes the working principle and basic concepts of decision trees, introduces several commonly used decision tree algorithms, and compares their classification results on five public data sets to assess their relative strengths and weaknesses. Finally, the random forest algorithm is applied to the credit evaluation of electricity customers, which provides ideas for further research.
Authors
Zhang Haiyan (张海燕)
Liu Yan (刘岩)
Ma Limeng (马丽萌)
Yuan Jinsha (苑津莎)
Ju Hanji (巨汉基)
Wei Tongjia (魏彤珈)
(North China Electric Power University, Baoding 071003, China; State Grid Jibei Electric Power Co. Ltd. Research Institute, North China Electric Power Research Institute Co. Ltd., Beijing 100045, China)
Source
《华北电力技术》
CAS
2017, Issue 6, pp. 42-47 (6 pages)
North China Electric Power
Keywords
data mining
classification algorithm
decision tree