Research on Sample Imbalance Classification Based on BLT Method (cited by: 1)
Abstract: For classification problems with an imbalanced sample distribution, the Branch Learning Tree (BLT) method is used to improve classification accuracy, and it is applied to the classification of tumor immune subtypes to verify its effectiveness. The number of samples in each immune subtype is counted and used to build a Huffman tree; traditional classifiers serve as the branch nodes, and classification proceeds top-down through the tree, yielding accurate classification of the imbalanced data. With the BLT method, overall classification accuracy improves by about 1.5% compared with the traditional classifier, and on the most severely misclassified class the classification performance improves by up to 79%. The method can therefore be used to improve classification performance on imbalanced problems, and the gain is especially pronounced for classes with few samples.
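The abstract only sketches the BLT construction, so the following is a minimal illustrative sketch (not the paper's code) of the described idea in Python: per-class sample counts drive a Huffman-style merge, a binary classifier is trained at every internal node to separate its left and right subtrees, and prediction walks the tree top-down until it reaches a leaf. The use of scikit-learn's LogisticRegression as the stand-in "traditional classifier", and all class and function names, are assumptions made for illustration.

# Illustrative sketch of the BLT idea described in the abstract (not the
# paper's implementation): Huffman tree over class counts, binary classifier
# at each internal node, top-down prediction.
import heapq
import itertools
import numpy as np
from sklearn.linear_model import LogisticRegression


class Node:
    def __init__(self, labels, left=None, right=None):
        self.labels = labels          # set of class labels under this node
        self.left, self.right = left, right
        self.clf = None               # binary classifier: 0 -> left, 1 -> right


def build_huffman_tree(class_counts):
    """Repeatedly merge the two rarest groups, Huffman-style."""
    counter = itertools.count()       # tie-breaker so heapq never compares Nodes
    heap = [(n, next(counter), Node({c})) for c, n in class_counts.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        n1, _, a = heapq.heappop(heap)
        n2, _, b = heapq.heappop(heap)
        heapq.heappush(heap, (n1 + n2, next(counter), Node(a.labels | b.labels, a, b)))
    return heap[0][2]


def fit(node, X, y):
    """Train a binary classifier at each internal node (left vs. right subtree)."""
    if node.left is None:             # leaf: a single class, nothing to train
        return
    mask = np.isin(y, list(node.labels))
    Xn, yn = X[mask], y[mask]
    side = np.isin(yn, list(node.right.labels)).astype(int)
    node.clf = LogisticRegression(max_iter=1000).fit(Xn, side)
    fit(node.left, X, y)
    fit(node.right, X, y)


def predict_one(node, x):
    """Walk top-down until a leaf (a single class) is reached."""
    while node.left is not None:
        go_right = node.clf.predict(x.reshape(1, -1))[0] == 1
        node = node.right if go_right else node.left
    return next(iter(node.labels))


# Usage (illustrative):
#   root = build_huffman_tree(dict(zip(*np.unique(y_train, return_counts=True))))
#   fit(root, X_train, y_train)
#   y_pred = [predict_one(root, x) for x in X_test]

Merging the two rarest groups first is the standard Huffman construction, so the smallest classes sit deepest in the tree; this is one plausible reading of why the paper reports the largest gains on the classes with the fewest samples.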
Author: BAI Xin-yu (白新宇), Guizhou Normal University, Guiyang 550000
Institution: Guizhou Normal University
Source: Modern Computer (现代计算机), 2021, Issue 4, pp. 52-55 (4 pages)
Keywords: Branch Learning Tree; Sample Imbalance; Immune Subtype Classification; Huffman Tree
