
Random Forest Optimization Method Based on Cluster Undersampling Strategy (cited by: 5)
Abstract: The classification performance of random forests is affected by imbalance between classes and irregular structure within classes in the sample set. To address this, a random forest optimization method based on a cluster undersampling strategy is proposed. The method clusters the majority-class samples of the original data into as many sub-clusters as there are minority-class samples. One sample is drawn at random, with replacement, from each sub-cluster and merged with the minority-class samples to form a balanced sample set. The balanced sample set is then sampled at random with replacement to form the training set of a single decision tree, and the tree is built. Samples that were not drawn in either of the two sampling steps are used as out-of-bag data for model testing. This process is repeated many times to form the random forest. Experiments on 10 imbalanced data sets show that, on these 10 data sets, both the classification ability and the stability of the method are better than those of the traditional random forest.
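The sampling steps described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not name the clustering algorithm, so plain k-means is assumed here; the function names `kmeans_labels`, `cluster_majority`, and `draw_tree_sample` are hypothetical; clustering is done once and reused for every tree; and "out-of-bag" is read as every original sample that does not end up in the tree's training set. Tree construction itself is omitted.

```python
import random


def kmeans_labels(points, k, rng, iters=20):
    """Plain Lloyd's k-means (an assumed choice); returns one cluster label per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of its members; leave it alone if it has none.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def cluster_majority(X, y, minority_label, rng):
    """Split the majority class into as many sub-clusters as there are minority samples."""
    maj_idx = [i for i, lab in enumerate(y) if lab != minority_label]
    min_idx = [i for i, lab in enumerate(y) if lab == minority_label]
    labels = kmeans_labels([X[i] for i in maj_idx], len(min_idx), rng)
    clusters = [[i for i, lab in zip(maj_idx, labels) if lab == c] for c in range(len(min_idx))]
    # k-means can leave a cluster empty; such clusters are simply skipped in this sketch.
    return [c for c in clusters if c], min_idx


def draw_tree_sample(clusters, min_idx, n_total, rng):
    """Build the balanced set, the bootstrap training set, and the out-of-bag set for one tree."""
    # First sampling: one random majority sample per sub-cluster, merged with the minority class.
    balanced = [rng.choice(c) for c in clusters] + list(min_idx)
    # Second sampling: bootstrap (with replacement) from the balanced set.
    train = [rng.choice(balanced) for _ in balanced]
    # Out-of-bag: original samples that never reached this tree's training set.
    oob = sorted(set(range(n_total)) - set(train))
    return balanced, train, oob
```

Calling `cluster_majority` once and then `draw_tree_sample` once per tree yields a different balanced training set for each tree, each paired with its own out-of-bag set for testing.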
Authors: LUO Jigen (罗计根); DU Jianqiang (杜建强); NIE Bin (聂斌); LI Huan (李欢); NIE Jianhua (聂建华); CHEN Yufeng (陈裕凤) (School of Computer Science, Jiangxi University of Traditional Chinese Medicine, Nanchang 330004, China; School of Chinese Medicine, Jiangxi University of Traditional Chinese Medicine, Nanchang 330004, China)
Source: Computer Engineering and Applications (《计算机工程与应用》), indexed in CSCD and the Peking University Core Journals list, 2020, No. 22, pp. 166-172 (7 pages)
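Funding: National Natural Science Foundation of China (No. 61562045); Key Research and Development Program of the Jiangxi Provincial Department of Science and Technology (No. 20171ACE50021); Traditional Chinese Medicine Research Program of the Jiangxi Provincial Health and Family Planning Commission (No. 2017A282).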
Keywords: random forest; unbalanced data; cluster analysis; Chinese medicine informatics
