Abstract
In real-world applications, classification data are often imbalanced: one class is rare relative to the other, and misclassifying the rare class may carry a much higher cost than misclassifying the majority class. Classification performance must therefore be judged by misclassification cost as well as by accuracy. This paper extends the Support Vector Machine (SVM) learning method with a Gaussian kernel by assigning different penalty parameters, C+ to the rare class and C- to the majority class, to obtain a more sensitive separating hyperplane, and uses a genetic algorithm to optimize the SVM learning parameters. A new evaluation function is introduced to assess the quality of the classification results during this optimization. Experimental results show that the algorithm performs well on imbalanced data and achieves high prediction accuracy on rare-class samples.
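As a rough illustration of the approach described above (not the authors' implementation), the sketch below trains an RBF-kernel SVM with different penalty weights for the rare and majority classes using scikit-learn. The values of C+, C-, and the kernel width gamma are placeholders; in the paper they are tuned by a genetic algorithm using the proposed evaluation function, neither of which is reproduced here.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import precision_score, recall_score

# Imbalanced toy data: roughly 5% of samples belong to the rare (positive) class.
X, y = make_classification(n_samples=2000, n_features=10,
                           weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Placeholder penalties: C_minus for the majority class, C_plus for the rare class.
# The paper tunes these (and gamma) with a genetic algorithm; fixed values are used here.
C_plus, C_minus, gamma = 10.0, 1.0, 0.1
clf = SVC(kernel="rbf", gamma=gamma, C=C_minus,
          class_weight={0: 1.0, 1: C_plus / C_minus})  # per-class penalty ratio
clf.fit(X_tr, y_tr)

y_pred = clf.predict(X_te)
print("rare-class recall:   ", recall_score(y_te, y_pred))
print("rare-class precision:", precision_score(y_te, y_pred))

In scikit-learn the effective penalty for each class is C multiplied by its class_weight entry, so the rare class is trained with penalty C+ and the majority class with C-, mirroring the asymmetric-penalty idea in the abstract.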
Source
Computer Engineering (《计算机工程》)
CAS
CSCD
Peking University Core Journals (北大核心)
2008, No. 20, pp. 198-199, 202 (3 pages)
Keywords
Support Vector Machine (SVM)
imbalanced data
evaluation function
learning parameter optimization