Abstract
Traditional classification algorithms tend to favor the majority class when applied to imbalanced data, which lowers classification accuracy on the minority class. Addressing the classification of imbalanced data, this paper first reviews existing performance evaluation measures for imbalanced data classification. It then surveys the commonly used data-sampling methods and the existing classification methods. Finally, it describes hybrid approaches that combine data sampling with classification methods.
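The bias toward the majority class can be illustrated with a minimal sketch. It is not taken from the paper: the toy labels, the 95:5 class ratio, and the helper names `majority_baseline`, `recall`, and `random_oversample` are all assumptions made for illustration. The sketch shows a predictor that reaches high overall accuracy while finding no minority samples, followed by random oversampling as one simple example of a data-sampling method.

```python
import random
from collections import Counter


def majority_baseline(labels):
    """Predict the most frequent label for every sample."""
    majority = Counter(labels).most_common(1)[0][0]
    return [majority] * len(labels)


def accuracy(y_true, y_pred):
    """Fraction of all samples predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive):
    """Fraction of samples of class `positive` that were predicted as `positive`."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    total = sum(1 for t in y_true if t == positive)
    return hits / total if total else 0.0


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples of the smaller classes
    until every class is as large as the largest one."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_x, out_y = list(samples), list(labels)
    for cls, n in counts.items():
        pool = [x for x, y in zip(samples, labels) if y == cls]
        for _ in range(target - n):
            out_x.append(rng.choice(pool))
            out_y.append(cls)
    return out_x, out_y


# Hypothetical toy data: 95 majority samples (label 0), 5 minority samples (label 1).
y = [0] * 95 + [1] * 5
x = list(range(100))

pred = majority_baseline(y)
acc = accuracy(y, pred)                    # high overall accuracy
min_recall = recall(y, pred, positive=1)   # no minority sample is found

x_bal, y_bal = random_oversample(x, y)     # both classes now have 95 samples
```

Overall accuracy hides the failure on the minority class here, which is why per-class measures such as minority recall are used when evaluating classifiers on imbalanced data.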
Source
《计算机应用研究》
CSCD
PKU Core Journal (北大核心)
2008, Issue 5, pp. 1301-1303, 1308 (4 pages)
Application Research of Computers
Keywords
machine learning
imbalanced data set
data classification