Abstract
To improve the classification accuracy of the random forest algorithm, the factors that affect its accuracy are analyzed and, targeting the diversity of the decision trees in the forest, an optimization method based on computing node matching costs is proposed. The split nodes and split attributes of all decision trees in the random forest are compared, and the Hungarian algorithm is used to compute the highest matching cost between the nodes of each pair of decision trees, from which a similarity matrix between the trees is built. Based on this similarity matrix, the decision trees are clustered with a spectral clustering algorithm. The tree with the highest Kappa coefficient in each cluster is retained to build a new random forest, and each retained tree's decision is weighted by its own Kappa coefficient. Experimental results show that the clustered and weighted random forest achieves higher classification accuracy than the traditional random forest algorithm.
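The last two steps of the abstract (keeping the highest-Kappa tree in each cluster, then weighting votes by Kappa) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the cluster label of each tree is already available from the spectral clustering step, that Kappa means Cohen's kappa computed on a held-out validation set, and that the weighted decision is a plain sum of Kappa values per class. The function names `cohen_kappa`, `select_best_per_cluster`, and `weighted_vote` are hypothetical.

```python
from collections import Counter, defaultdict


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(y_true)
    # Observed agreement between true labels and predictions.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Agreement expected by chance, from the marginal class counts.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        # Degenerate case (a single class everywhere); assumed convention.
        return 0.0
    return (p_o - p_e) / (1 - p_e)


def select_best_per_cluster(cluster_labels, kappas):
    """Return the index of the highest-kappa tree in each cluster."""
    best = {}
    for idx, (cluster, kappa) in enumerate(zip(cluster_labels, kappas)):
        if cluster not in best or kappa > kappas[best[cluster]]:
            best[cluster] = idx
    return sorted(best.values())


def weighted_vote(predictions, weights):
    """Combine one predicted label per kept tree for a single sample.

    Each tree adds its weight (assumed here to be its kappa) to the
    score of the class it predicts; the class with the top score wins.
    """
    score = defaultdict(float)
    for label, weight in zip(predictions, weights):
        score[label] += weight
    return max(score, key=score.get)


# Toy usage: four trees evaluated on six validation samples.
y_val = [0, 0, 1, 1, 1, 0]
tree_preds = [
    [0, 0, 1, 1, 1, 0],
    [0, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 1, 0],
    [1, 1, 0, 0, 0, 1],
]
kappas = [cohen_kappa(y_val, p) for p in tree_preds]
clusters = [0, 0, 1, 1]  # assumed output of the clustering step
kept = select_best_per_cluster(clusters, kappas)
```

One consequence of this weighting choice is that a tree with a negative Kappa would subtract from the class it votes for. A real implementation would probably discard or clip such trees, but the abstract does not say how this case is handled.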
Authors
朱瑛
谢睿
郑若池
ZHU Ying; XIE Rui; ZHENG Ruo-chi (School of Mechanics Engineering, Shenyang Aerospace University, Shenyang 110136, China)
Source
Computer Engineering and Design (《计算机工程与设计》)
Indexed in the PKU Core Journals list (北大核心)
2020, No. 11, pp. 3106-3111 (6 pages)
Funding
State Key Laboratory Fund Project (SKLRS-2013-MS-04).
Keywords
random forest (随机森林)
decision tree (决策树)
Hungarian algorithm (匈牙利算法)
spectral clustering (谱聚类)
Kappa coefficient (Kappa系数)
clustering optimization (聚类优化)