Feature Weighted Clustering Algorithm for Unbalanced Data (面向不平衡数据的特征加权聚类算法)
Cited by: 4
Abstract: The class distribution of imbalanced data sets is severely skewed. Because traditional clustering algorithms aim to improve overall learning performance, they tend to cluster the majority class and neglect the rare class, which is often more valuable. This paper proposes an iterative feature-weighted clustering algorithm. In each round, feature weights are determined from the characteristics of the current clusters and a feature-importance measure, and the resulting weights are used in the next round of clustering; iteration stops once the weights stabilize. Experiments on several imbalanced UCI data sets show that the algorithm identifies important features and raises their weights, keeps the clustering from being overly biased toward the majority class, and effectively improves clustering performance.
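The iterative loop described in the abstract (cluster, re-estimate feature weights from the clusters, re-cluster until the weights stop changing) can be sketched as follows. This is only an illustration under stated assumptions, not the paper's method: the record does not give the feature-importance function, so a per-feature ratio of between-cluster spread to within-cluster variance is assumed here, and a weighted k-means with deterministic farthest-point seeding is swapped in for the single-pass clustering named in the keywords. All function names are hypothetical.

```python
import numpy as np

def weighted_sq_dist(X, centers, w):
    # (n, k) matrix of weighted squared distances: sum_j w_j * (x_j - c_j)^2
    diff = X[:, None, :] - centers[None, :, :]
    return (diff ** 2 * w).sum(axis=2)

def maximin_init(X, k, w):
    # Deterministic seeding (an assumption): start from the point farthest
    # from the data mean, then repeatedly add the point farthest from the
    # centers chosen so far.
    first = weighted_sq_dist(X, X.mean(axis=0, keepdims=True), w)[:, 0].argmax()
    idx = [first]
    for _ in range(k - 1):
        idx.append(weighted_sq_dist(X, X[idx], w).min(axis=1).argmax())
    return X[idx].copy()

def weighted_kmeans(X, k, w, n_iter=100):
    # Stand-in clustering step: Lloyd's k-means under the weighted distance.
    centers = maximin_init(X, k, w)
    for _ in range(n_iter):
        labels = weighted_sq_dist(X, centers, w).argmin(axis=1)
        new_centers = np.array([
            X[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

def feature_importance(X, labels, centers):
    # Assumed importance measure (NOT taken from the paper): variance of the
    # cluster centers divided by the pooled within-cluster variance, per
    # feature. Centers are not weighted by cluster size, so a small cluster
    # counts as much as a large one.
    within = np.zeros(X.shape[1])
    for c in range(len(centers)):
        members = X[labels == c]
        if len(members):
            within += ((members - centers[c]) ** 2).sum(axis=0)
    within /= len(X)
    return centers.var(axis=0) / (within + 1e-12)

def iterative_feature_weighted_clustering(X, k, tol=1e-4, max_rounds=20):
    m = X.shape[1]
    w = np.full(m, 1.0 / m)              # start from uniform weights
    for _ in range(max_rounds):
        labels, centers = weighted_kmeans(X, k, w)
        imp = feature_importance(X, labels, centers)
        if imp.sum() == 0:               # degenerate case: keep current weights
            break
        new_w = imp / imp.sum()          # normalize so the weights sum to 1
        converged = np.abs(new_w - w).max() < tol
        w = new_w
        if converged:                    # weights are stable: stop iterating
            break
    return labels, w
```

Calling `iterative_feature_weighted_clustering(X, k=2)` on an `(n_samples, n_features)` array returns the cluster labels from the last round and the final weight vector. The tolerance `tol` and the cap `max_rounds` are illustrative defaults rather than values from the paper.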
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), indexed in CSCD and the Peking University Core Journals list, 2013, No. 8, pp. 1809-1812 (4 pages)
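Funding: Supported by the National Natural Science Foundation of China (61070061), the Guangzhou Science and Technology Program (2011J5100004), and the Yuexiu District (Guangzhou) Science and Technology Program (2012-TP-005).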
Keywords: imbalanced data; single-pass clustering algorithm; feature weighting