Abstract
When classifying unbalanced data, traditional decision trees raise the sample weight of the positive class and discard part of the negative-class sample information, which lowers the prediction accuracy on the negative class. This paper introduces ideas from reinforcement learning and proposes an improved decision tree method based on the Markov decision process. Under the Markov decision process, the normalized mutual information and the Matthews correlation coefficient of the current splitting feature serve as a reward or penalty on the information gain ratio, which yields a new feature selection criterion. Experimental results show that, compared with traditional decision tree algorithms, the improved Markov decision tree increases both the overall prediction accuracy and the negative-class prediction accuracy on unbalanced data.
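The three quantities named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact way the reward or penalty is applied, so the combination in `adjusted_score` (gain ratio multiplied by `1 + NMI * MCC`) is a hypothetical choice made only for illustration, as are all function names and the geometric-mean normalization used for NMI. The Markov decision process itself is not modeled here.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def group_labels(feature, labels):
    """Collect the labels that fall under each distinct feature value."""
    groups = {}
    for x, y in zip(feature, labels):
        groups.setdefault(x, []).append(y)
    return groups


def mutual_information(feature, labels):
    """I(feature; labels) = H(labels) - H(labels | feature)."""
    n = len(labels)
    groups = group_labels(feature, labels)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def gain_ratio(feature, labels):
    """Information gain divided by the split information H(feature)."""
    split_info = entropy(feature)
    return mutual_information(feature, labels) / split_info if split_info > 0 else 0.0


def nmi(feature, labels):
    """Mutual information normalized by the geometric mean of the entropies
    (one common convention; assumed here, not taken from the paper)."""
    denom = math.sqrt(entropy(feature) * entropy(labels))
    return mutual_information(feature, labels) / denom if denom > 0 else 0.0


def mcc(y_true, y_pred, positive=1):
    """Matthews correlation coefficient for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom > 0 else 0.0


def split_predictions(feature, labels):
    """Predict the majority label inside each branch of a one-level split."""
    groups = group_labels(feature, labels)
    majority = {x: Counter(g).most_common(1)[0][0] for x, g in groups.items()}
    return [majority[x] for x in feature]


def adjusted_score(feature, labels, positive=1):
    """Hypothetical criterion: NMI * MCC acts as a reward when the MCC is
    positive and as a penalty when it is negative."""
    reward = nmi(feature, labels) * mcc(labels, split_predictions(feature, labels), positive)
    return gain_ratio(feature, labels) * (1.0 + reward)


# Toy unbalanced data: six negative samples, two positive samples.
labels = [0, 0, 0, 0, 0, 0, 1, 1]
informative = [0, 0, 0, 0, 0, 0, 1, 1]
noise = [0, 1, 0, 1, 0, 1, 0, 1]
print(adjusted_score(informative, labels), adjusted_score(noise, labels))
```

Under this assumed combination, a feature whose split separates the minority class cleanly scores higher than its plain gain ratio, while a feature whose split predicts only the majority class receives no reward, because its MCC is zero.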
Authors
YU Anchi
CHU Maoxiang
YANG Yonghui
DONG Xiu
School of Electronic and Information Engineering, University of Science and Technology Liaoning, Anshan 114000, China; Department of Automotive Engineering, Yantai Automobile Engineering Professional College, Yantai 265500, China
Source
Journal of Hefei University of Technology: Natural Science
CAS
Peking University Core Journal (北大核心)
2021, No. 5, pp. 616-620 (5 pages)
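Funding
National Natural Science Foundation of China (71771112)
Natural Science Foundation of Liaoning Province (20180550067)
Basic Scientific Research Project of Higher Education Institutions of Liaoning Province (2017LNQN11)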
Keywords
decision tree
unbalanced data
reinforcement learning
normalized mutual information
Matthews correlation coefficient