期刊文献+

结合mean-shift与MST的K-means聚类算法 被引量:5

K-means Clustering Algorithm Combined with mean-shift and Minimum Spanning Tree
下载PDF
导出
摘要 针对初始点选择不当导致K-means陷入局部最小值问题,提出一种结合自适应mean-shift与最小生成树(MST)的K-means聚类算法。将数据对象投影到主成分分析(PCA)子空间,给出自适应mean-shift算法,并在PCA子空间内将数据向密度大的区域聚集,再利用MST与图连通分量算法,找出数据的类别数和类标签,据此计算原始空间的密度峰值,并将其作为K-means聚类的初始中心点。对K-means的目标函数、聚类精度和运行时间进行比较,结果表明,该算法在较短的运行时间内能给出较优的全局解。 Given an inappropriate set of initial clustering centroids, K-means algorithm can get trapped in a local minimum. To remedy this, this paper proposes a K-means clustering algorithm combined with adaptive mean-shift and Minimum Spanning Tree(MST). The original data set is projected into Principal Component Analysis(PCA) subspace. An adaptive Mean-shift is proposed and run in the PCA subspace to let the data move to dense regions, and via the MST and graph connected component algorithm, it finds the number of clusters and the cluster indicators. According to the indicators, the density peaks are computed in the full space and taken as the initial centroids for K-means clustering. Experimental results show that the proposed algorithm can provide better global solution and higher clustering accuracy within a shorter period of execution time.
作者 徐沁 罗斌
出处 《计算机工程》 CAS CSCD 2013年第12期204-210,共7页 Computer Engineering
基金 国家自然科学基金资助项目(61073116 61211130309)
关键词 聚类分析 K—means算法 初始中心点 Mean—Shift算法 主成分分析 最小生成树 clustering analysis K-means algorithm initial centroid Mean-Shift algorithm Principal Component Analysis(PCA) Minimum Spanning Tree(MST)
  • 相关文献

参考文献17

  • 1Banfield J,Raftery A.Model-based Gaussian and Non-gaussian Clustering[J].Biometrics,1993,49(1):803-821.
  • 2Linde Y,Buzo A,Gray R M.An Algorithm for Vector Quantizer Design[J].IEEE Transactions on Communications,1980,28(1):84-95.
  • 3Huang C M,Harris R W.A Comparison of Several Vector Quantization Codebook Generation Approaches[J].IEEE Transactions on Image Processing,1993,2(1):108-112.
  • 4Redmond S J,Heneghan C.A Method for Initialising the K-means Clustering Algorithm Using Kd-trees[J].Pattern Recognition Letters,2007,28(8):965-973.
  • 5Kaufman L,Rousseeuw P J.Finding Groups in Data:An Introduction to Cluster Analysis[M].New Jersey,USA:Wiley Interscience,1990.
  • 6Pe?a J M,Lozano J A,Larra?aga P.An Empirical Comparison of Four Initialization Methods for the K-means Algorithm[J].Pattern Recognition Letters,1999,20(10):1027-1040.
  • 7汪中,刘贵全,陈恩红.一种优化初始中心点的K-means算法[J].模式识别与人工智能,2009,22(2):299-304. 被引量:140
  • 8Katsavounidis I,Kuo C C J,Zhang Z.A New Initialization Technique for Generalized Lloyd Iteration[J].IEEE Signal Processing Letters,1994,10(1):144-146.
  • 9Galluccio L,Michel O,Comon P,et al,Graph Based K-means Clustering[J].Signal Processing,2012,92(9):1970-1984.
  • 10Bishnu P S,Bhattacherjee V.Software Fault Prediction Using Quad Tree-based K-means Clustering Algorithm[J].IEEE Transactions on Knowledge and Data Engineering,2012,24(6):1146-1150.

二级参考文献26

  • 1李永森,杨善林,马溪骏,胡笑旋,陈增明.空间聚类算法中的K值优化问题研究[J].系统仿真学报,2006,18(3):573-576. 被引量:39
  • 2钱线,黄萱菁,吴立德.初始化K-means的谱方法[J].自动化学报,2007,33(4):342-346. 被引量:32
  • 3Han J, Kamber M. Data Mining Concepts and Techniques. Orlando, USA: Morgan Kaufmann Publishers, 2001
  • 4Huang J Z, Ng M K, Rang Hongqiang, et al. Automated Variable Weighting in K-means Type Clustering. IEEE Trans on Pattern Analysis and Machine Intelligence, 2005, 27 (5) : 657 - 668
  • 5Dhillon I S, Guan Yuqiang, Kogan J. Refining Clusters in High Dimensional Text Data//Proc of the 2nd SIAM Workshop on Clustering High Dimensional Data. Arlington, USA, 2002 : 59 - 66
  • 6Zhang B. Generalized K-Harmonic Means: Dynamic Weighting of Data in Unsupervised Learning//Proc of the 1 st SIAM International Conference on Data Mining. Chicago, USA, 2001 : 1 - 13
  • 7Sarafis I, Zalzala A M S, Trinder P W. A Genetic Rule-Based Data Clustering Toolkit//Proc of the Congress on Evolutionary Computation. Honolulu, USA, 2002 : 1238 - 1243
  • 8Ma J, Perkins S. Time-Series Novelty Detection Using One-Class Support Vector Machines// Proc of the International Joint Conference on Neural Networks. Portland, USA, 2003, Ⅲ: 1741 - 1745
  • 9Kaufman L,Rousseeuw P J. Finding Groups in Data: An Introduction to Cluster Analysis. New York, USA: John Wiley & Sons, 1990
  • 10Rui Xu, Wunsch D I I. Survey of Clustering Algorithms. IEEE Trans on Neural Networks, 2005, 16(3 ) : 645 -678

共引文献141

同被引文献41

  • 1贺玲,吴玲达,蔡益朝.数据挖掘中的聚类算法综述[J].计算机应用研究,2007,24(1):10-13. 被引量:228
  • 2倪巍伟,陈耿,孙志挥.一种基于数据垂直划分的分布式密度聚类算法[J].计算机研究与发展,2007,44(9):1612-1617. 被引量:8
  • 3孙吉贵,刘杰,赵连宇.聚类算法研究[J].软件学报,2008(1):48-61. 被引量:1079
  • 4Jiang Dongyang, Zheng Wei, Lin Xiaoqing. Research on selec- tion of initial center points based on improved K-Means algo- rithrn [C] //2nd International Conference on Computer Science and Network Technology, 2012: 1146-1149.
  • 5Yao Mingyu, Pi Deehang, Cong Xiangxiang. Chinese text clustering algorithm based K-Means [C] //Proceedings of In- ternational Conference on Services Science, Management and Engineering, 2010: 9-12.
  • 6Zhang Junhao, Ha Minghu, WU Jing. Implementation of rough fuzzy K-Means clustering algorithm in matlab [C] //In- ternationa Conference on Machine Learning and Cybernetics, 2010: 2084-2087.
  • 7Chandrasekhar Am, Raghuveer K. Intrusion detection tech- nique by using K-Means, fuzzy neural network and SVM clas- sifiers [C] //International Conference onComputer Communi- cation and Informatics, 2013: 1-7.
  • 8Verma Vieky Kumar, Khanna Nitin. Indian language identifi- cation using K-Means clustering and support vector machine (SVM) [C] //Students Conference on Engineering and Sys- tems, 2013: 1-5.
  • 9Gan Zhigang, Xiao Nanfeng. A new ensemble learning algo- rithm based on improved K-Means for training neural network ensembles [C] //Second International Symposium on Intelli- gent Information Technology and Security Informatics, 2009: 8-11.
  • 10Megler V M,Maier D. When Big Data Leads to Lost Data [ C ]//Proceedings of the 5th International Conference on Information and Knowledge Management. New York, USA : ACM Press,2012 : 1-8.

引证文献5

二级引证文献25

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部