Improved SVM-KNN algorithm for imbalanced datasets classification (cited by: 21)
Abstract: A Support Vector Machine (SVM) tends to misclassify samples that lie close to its separating hyperplane when the data are imbalanced. To address this, an improved SVM-KNN algorithm that combines SVM with K Nearest Neighbor (KNN) is proposed. In the classification phase, the algorithm computes the distance from the test sample to the optimal SVM hyperplane in the feature space. If the distance is greater than a given threshold, the test sample is classified directly by the SVM; if it is smaller than the threshold, all support vectors are taken as the candidate neighbors of the test sample and the sample is classified by KNN. Extensive experiments on UCI datasets show that the algorithm clearly improves both the recognition rate of minority-class samples and the overall performance of the classifier.
Source: Computer Engineering and Applications (《计算机工程与应用》), indexed in CSCD and the Peking University Core Journals list (北大核心), 2016, No. 4, pp. 51-55, 103 (6 pages in total)
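Funding: National Natural Science Foundation of China (No. 31170393); Natural Science Foundation of Shaanxi Province (No. 2012JM8023); Special Natural Science Fund of the Shaanxi Provincial Department of Education (No. 12JK0726)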
Keywords: Support Vector Machine (SVM); K Nearest Neighbor (KNN); imbalanced datasets
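The classification phase described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes a linear SVM that has already been trained, measures neighbor distances in the input space with the Euclidean metric, and sends a sample lying exactly at the threshold to the SVM (the abstract does not say how that case is handled). The names `distance_to_hyperplane`, `svm_knn_predict`, `threshold`, and `k` are hypothetical and chosen only for this example.

```python
import math
from collections import Counter


def distance_to_hyperplane(x, w, b):
    """Signed geometric distance from x to the hyperplane w.x + b = 0.

    Assumption: a linear SVM with weights w and bias b, trained elsewhere.
    """
    g = sum(wi * xi for wi, xi in zip(w, x)) + b
    norm = math.sqrt(sum(wi * wi for wi in w))
    return g / norm


def svm_knn_predict(x, w, b, support_vectors, sv_labels, threshold, k):
    """Use the SVM far from the hyperplane and KNN over the support vectors near it."""
    d = distance_to_hyperplane(x, w, b)
    if abs(d) >= threshold:
        # Far from the boundary: trust the sign of the SVM output.
        return 1 if d > 0 else -1
    # Near the boundary: all support vectors are candidate neighbors,
    # and the k closest ones vote on the label.
    nearest = sorted(
        zip(support_vectors, sv_labels),
        key=lambda pair: math.dist(pair[0], x),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data (made up for illustration): hyperplane x1 = 0, three support vectors.
w, b = [1.0, 0.0], 0.0
support_vectors = [(1.0, 0.0), (1.0, 1.0), (-1.0, 0.0)]
sv_labels = [1, 1, -1]

far_label = svm_knn_predict((3.0, 0.0), w, b, support_vectors, sv_labels, 0.5, 3)
near_label = svm_knn_predict((-0.2, 0.5), w, b, support_vectors, sv_labels, 0.5, 3)
print(far_label, near_label)
```

In this toy setup the second point lies just on the negative side of the hyperplane, so the SVM alone would label it -1, while the vote among the three support vectors labels it +1. The choice of `threshold` and `k` is left open here; the abstract only states that a given threshold is used.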
References: 17; secondary references: 82

Co-citing documents: 2494; co-cited documents: 201; citing documents: 21; secondary citing documents: 68
