Improved K-means clustering algorithm based on feature selection and removal on target point
Abstract: To address the inability of the K-means algorithm to suppress noise features and to cluster irregularly shaped groups in high-dimensional data, an improved K-means clustering algorithm based on feature selection and removal on target points is proposed. The algorithm uses the Minkowski metric as the distance measure for classifying target points, adds a weight adjustment parameter a, and resets the weighting coefficient α to perform feature selection and removal, which reduces the noise introduced by features that do not contribute to clustering. In the validation experiments, UCI real-world datasets and artificial datasets are clustered to verify that the improved algorithm suppresses noise features. The algorithm is also compared with the WK-means and iMWK-means algorithms to analyze the applicability of feature selection during clustering, and the optimal distance coefficient β and weighting coefficient α are sought.
Authors: YANG Hua-hui; MENG Chen; WANG Cheng; YAO Yun-zhi (Department of Missile Engineering, Army Engineering University, Shijiazhuang 050003, China)
Source: Control and Decision (《控制与决策》), 2019, No. 6, pp. 1219-1226 (8 pages); indexed in EI, CSCD, and the Peking University Core Journals list
Funding: National Natural Science Foundation of China (61501493)
Keywords: K-means algorithm; feature selection; high-dimensional data clustering; feature weighting; data denoising
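
The abstract names the ingredients (a Minkowski distance, per-feature weights, and removal of noisy features) but not the exact formulas. The following is only a minimal sketch of how those pieces could fit together, written in the style of generic Minkowski-weighted K-means rather than as the authors' method. It assumes that β acts as the Minkowski exponent applied to both the weights and the coordinate differences; the `threshold` used to drop features and all function names are hypothetical.

```python
import numpy as np


def weighted_minkowski(x, c, w, beta):
    """Feature-weighted Minkowski dissimilarity (beta-th power, root omitted).

    Assumption: the weight of each feature is raised to the same
    exponent beta as the coordinate difference.
    """
    x, c, w = (np.asarray(a, dtype=float) for a in (x, c, w))
    return float(np.sum((w ** beta) * np.abs(x - c) ** beta))


def assign_points(X, centers, w, beta):
    """Assign every row of X to the centre with the smallest dissimilarity."""
    X = np.asarray(X, dtype=float)
    centers = np.asarray(centers, dtype=float)
    w = np.asarray(w, dtype=float)
    # diff has shape (n_points, n_centers, n_features) via broadcasting.
    diff = np.abs(X[:, None, :] - centers[None, :, :]) ** beta
    dist = np.sum((w ** beta) * diff, axis=2)
    return np.argmin(dist, axis=1)


def drop_low_weight_features(w, threshold):
    """Hypothetical removal step: zero out small weights, then renormalise."""
    w = np.asarray(w, dtype=float)
    kept = np.where(w >= threshold, w, 0.0)
    total = kept.sum()
    if total == 0.0:
        raise ValueError("threshold removes every feature")
    return kept / total


# Toy data: feature 0 separates two groups, feature 1 is pure noise.
X = [[0.0, 0.0], [0.0, 10.0], [5.0, 0.0], [5.0, 10.0]]
centers = [[0.0, 5.0], [5.0, 5.0]]
w = drop_low_weight_features([0.9, 0.1], threshold=0.2)
labels = assign_points(X, centers, w, beta=2.0)
```

Once the weight of the noisy feature is set to zero, the assignment depends only on the informative feature, which is the noise-suppression effect the abstract describes.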