Abstract
Class imbalance in context knowledge datasets degrades classifier performance. Given that such datasets have few samples and high dimensionality, this paper analyzes the shortcomings of existing methods for handling class imbalance in context knowledge data and proposes an improved SVM algorithm based on classification post-processing. The improved algorithm introduces a weight parameter that adjusts the SVM decision function, increasing the contribution of minority-class samples to the classifier and shifting the separating hyperplane toward the majority class, thereby offsetting the effect of class imbalance on the SVM. Experiments on the MAROB dataset show that the improved algorithm predicts the minority class better than traditional machine learning algorithms.
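The abstract does not give the exact form of the weighted decision function, so the following is only a minimal sketch of one possible reading: after training, each minority-class support vector's term in the SVM decision function is scaled by a weight greater than 1. The toy support vectors, coefficients, bias, RBF kernel choice, and the names `minority_weight`, `decision_value`, and `predict` are all assumptions made for illustration and are not taken from the paper.

```python
import math

# Hypothetical, hand-picked "trained" SVM (not from the paper):
# one minority-class (+1) and one majority-class (-1) support vector.
SUPPORT_VECTORS = [(1.0, 0.0), (-1.0, 0.0)]
ALPHAS = [1.0, 1.0]   # dual coefficients; sum(alpha_i * y_i) == 0
LABELS = [1, -1]      # +1 = minority class, -1 = majority class
BIAS = -0.2           # negative bias mimics a boundary skewed by imbalance


def rbf_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel between two points given as tuples."""
    squared_distance = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * squared_distance)


def decision_value(x, support_vectors, alphas, labels, bias,
                   minority_weight=1.0, minority_label=1):
    """SVM decision function with minority-class terms scaled by a weight.

    With minority_weight == 1.0 this is the standard decision function
    f(x) = sum_i alpha_i * y_i * K(x_i, x) + b.
    """
    total = bias
    for sv, alpha, y in zip(support_vectors, alphas, labels):
        scale = minority_weight if y == minority_label else 1.0
        total += scale * alpha * y * rbf_kernel(sv, x)
    return total


def predict(x, support_vectors, alphas, labels, bias, minority_weight=1.0):
    """Return +1 (minority) if the weighted decision value is >= 0, else -1."""
    value = decision_value(x, support_vectors, alphas, labels, bias,
                           minority_weight=minority_weight)
    return 1 if value >= 0 else -1


if __name__ == "__main__":
    borderline = (0.05, 0.0)  # slightly closer to the minority support vector
    for weight in (1.0, 2.0):
        label = predict(borderline, SUPPORT_VECTORS, ALPHAS, LABELS, BIAS,
                        minority_weight=weight)
        print(f"minority_weight={weight}: predicted label {label}")
```

Because the RBF kernel is always positive, raising `minority_weight` can only increase the decision value, which moves the boundary into the majority-class region without retraining. This is the sense in which the adjustment is a post-processing step in this sketch.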
Source
《计算机应用研究》
CSCD
PKU Core Journal (北大核心)
2011, No. 8, pp. 2902-2904, 2908 (4 pages)
Application Research of Computers
Funding
National Natural Science Foundation of China (Grant No. 60773049)
Senior Talent Start-up Foundation of Jiangsu University (Grant No. 09JDG041)