Abstract
To further improve the classification accuracy of the random forest algorithm, an oblique forests algorithm based on decision boundary (OFDB) is proposed. Decision boundaries and adaptive weights are integrated into the random forest framework. The decision boundary serves as the splitting criterion, so that splits that were originally perpendicular to the axes of the data space become oblique hyperplanes, which improves the algorithm's ability to adapt to the structure of the data space. Adaptive weights improve the way leaf-node class labels are computed, which improves the algorithm's ability to classify imbalanced data. Experimental results show that, compared with the random forest algorithm, the proposed algorithm achieves higher classification accuracy and classifies imbalanced data better.
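The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the details of OFDB, so the sketch assumes (based on the keyword "logistic regression") that the decision boundary at a node is a logistic-regression hyperplane, and it assumes inverse class-frequency weights as a stand-in for the paper's adaptive weights. The function names `fit_logistic`, `oblique_split`, and `weighted_leaf_label` are hypothetical. The toy data is chosen so that no single axis-perpendicular threshold separates the classes, while the oblique hyperplane does.

```python
import math

def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit binary logistic regression by batch gradient descent.

    Returns (w, b); the decision boundary is the hyperplane w.x + b = 0.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() in range
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def oblique_split(X, w, b):
    """Send each sample index to the left or right child by hyperplane side."""
    left, right = [], []
    for i, x in enumerate(X):
        z = sum(wj * xj for wj, xj in zip(w, x)) + b
        (left if z < 0 else right).append(i)
    return left, right

def weighted_leaf_label(leaf_labels, class_counts):
    """Label a leaf by a weighted vote.

    ASSUMPTION: each sample votes with weight 1 / (training-set count of its
    class); this is a stand-in for the paper's adaptive weights, whose exact
    form is not given in the abstract.
    """
    scores = {}
    for c in leaf_labels:
        scores[c] = scores.get(c, 0.0) + 1.0 / class_counts[c]
    return max(scores, key=scores.get)

# Toy data: class 1 lies above the line x1 + x2 = 0, class 0 below it.
X = [(3, -1), (-1, 3), (2, 2), (1, 1), (-3, 1), (1, -3), (-2, -2), (-1, -1)]
y = [1, 1, 1, 1, 0, 0, 0, 0]

w, b = fit_logistic(X, y)
left, right = oblique_split(X, w, b)
print(left, right)  # [4, 5, 6, 7] [0, 1, 2, 3]

# A leaf holding three majority-class samples and one minority-class sample,
# with training-set counts of 90 and 10, is labeled with the minority class.
print(weighted_leaf_label([0, 0, 0, 1], {0: 90, 1: 10}))  # 1
```

In a full forest, a split of this kind would be fitted at every internal node on a bootstrap sample and a random feature subset, as in standard random forests; how OFDB handles multi-class nodes and how it sets the weights would need to be taken from the paper itself.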
Authors
KAN Xue-da (阚学达)
GUI Qiong (桂琼)
ZHANG Pan-feng (张攀峰)
College of Information Science and Engineering, Guilin University of Technology, Guilin 541004, China
Source
Computer Engineering and Design (《计算机工程与设计》)
Peking University Core Journal (北大核心)
2022, Issue 2, pp. 391-398 (8 pages)
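Funding
National Natural Science Foundation of China (61862019)
Guangxi Natural Science Foundation (2017GXNSFAA198223)
Guangxi Science and Technology Base and Talent Special Project (2018AD19136)
Guilin University of Technology Research Start-up Fund (GLUTQD2017065)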
Keywords
classification
random forests
logistic regression
splitting criterion
decision boundary