Abstract
Decision trees are an important method for solving classification problems. Many decision tree algorithms have been proposed, such as ID3, C4.5, and CART, which represent the common criteria for attribute splitting: Shannon entropy, information gain ratio, and the Gini index, respectively. These algorithms are independent of one another, however, so their individual strengths cannot be unified. To address this problem, the paper studies Tsallis entropy, proposes a Tsallis algorithm, and constructs decision trees based on it. On that basis it then introduces the concept of a key measure (关键度度量), which further compensates for the weakness of labeling leaf nodes by majority rule ("the minority obeys the majority"). Experimental results show that the scheme achieves high accuracy with small trees.
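The abstract does not spell out the Tsallis algorithm itself, so the following is only a minimal sketch of the standard Tsallis entropy, S_q = (1 - Σ p_i^q) / (q - 1), used as a node-impurity measure. It illustrates why one parameter q can stand in for several criteria: as q approaches 1 the value tends to Shannon entropy (natural logarithm), and at q = 2 it equals the Gini index. The function names `tsallis_entropy` and `split_gain` and the sample labels are assumptions made for illustration, not the authors' implementation.

```python
import math
from collections import Counter


def tsallis_entropy(labels, q):
    """Standard Tsallis entropy of a list of class labels.

    S_q = (1 - sum(p_i ** q)) / (q - 1).
    At q == 1 the Shannon limit -sum(p_i * ln p_i) is used directly;
    at q == 2 the result equals the Gini index 1 - sum(p_i ** 2).
    """
    n = len(labels)
    probs = [count / n for count in Counter(labels).values()]
    if abs(q - 1.0) < 1e-12:
        return -sum(p * math.log(p) for p in probs)
    return (1.0 - sum(p ** q for p in probs)) / (q - 1.0)


def split_gain(parent, children, q):
    """Impurity reduction of a candidate split (hypothetical helper).

    Parent impurity minus the size-weighted impurity of the child nodes.
    """
    n = len(parent)
    weighted = sum(len(c) / n * tsallis_entropy(c, q) for c in children)
    return tsallis_entropy(parent, q) - weighted


# Made-up labels, used only to exercise the functions.
labels = ["yes", "yes", "no", "no", "no", "yes", "no", "no"]
left, right = labels[:3], labels[3:]
for q in (1.0, 2.0):
    print(q, tsallis_entropy(labels, q), split_gain(labels, [left, right], q))
```

Under these assumptions, choosing q amounts to choosing the splitting criterion, so q can be treated as a tunable parameter rather than fixing one criterion in advance.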
Authors
LI Liang (李梁)
CONG Peiqiang (丛培强)
CHEN Yaru (陈亚茹)
School of Computer Science and Engineering, Chongqing University of Technology, Chongqing 400054, China
Source
Journal of Chongqing University of Technology: Natural Science (《重庆理工大学学报(自然科学)》)
Indexed in: CAS; Peking University Core Journals (北大核心)
2018, No. 10, pp. 143-148 (6 pages)
Funding
Chongqing Graduate Research and Innovation Fund project (CYS16222)
Chongqing University of Technology Graduate Innovation Fund project (YCX2016229)