
A New Clustering Algorithm for Big Data Environments  Cited by: 24

Novel Global Kmeans Clustering Algorithm for Big Data
Abstract  This paper proposes a new clustering algorithm, NGKCA (novel global k-means clustering algorithm), which addresses the limited detection rate and stability of classical clustering algorithms and is suited to clustering problems in big data environments. NGKCA has four phases. First, the spectral clustering NJW algorithm reduces the column dimension of the large data set and normalizes the data. Second, particle swarm optimization (PSO), which is insensitive to initial values, reduces the row dimension of the data set by selecting a temporary set of cluster centers. Third, the global k-means algorithm clusters the best cluster center set to obtain the cluster center points. Finally, PSO adjusts these center points to produce the final clustering partition. Experiments were carried out on several well-known machine learning data sets and on the standard network security data set KDDCUP99. The results show that NGKCA is more stable and achieves a higher detection rate than common algorithms such as spectral clustering, k-means, PSO, and global k-means, and that its time complexity is better than that of global k-means.
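Source  Computer Science (《计算机科学》), CSCD, Peking University Core Journal, 2015, No. 12, pp. 247-250 (4 pages)
Funding  Supported by the National Natural Science Foundation of China (61272450) and the Tianjin Science and Technology Support Project (14ZCZDGX00072)
Keywords  Global Kmeans, Spectral clustering, PSO, Clustering, KDDCUP99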
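
The third phase named in the abstract, global k-means, can be sketched as follows. This is a minimal pure-Python illustration of the general global k-means idea (grow the solution one center at a time, trying each candidate point as the new center and keeping the best result), not the authors' NGKCA implementation. The NJW column reduction and both PSO phases are omitted, and the names `sq_dist`, `kmeans`, `global_kmeans` and the `candidates` parameter are assumptions made for this example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, centers, max_iter=100):
    """Plain Lloyd iterations starting from the given centers.

    Returns (centers, sse), where sse is the sum of squared distances
    from each point to its nearest center.
    """
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        # Assignment step: group points by their nearest center.
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: sq_dist(p, centers[i]))
            groups[nearest].append(p)
        # Update step: move each center to the mean of its group.
        new_centers = []
        for c, g in zip(centers, groups):
            if g:
                new_centers.append(tuple(sum(col) / len(g) for col in zip(*g)))
            else:
                new_centers.append(c)  # an empty cluster keeps its center
        if new_centers == centers:
            break
        centers = new_centers
    sse = sum(min(sq_dist(p, c) for c in centers) for p in points)
    return centers, sse


def global_kmeans(points, k, candidates=None):
    """Grow from 1 to k centers, trying every candidate as the next center.

    `candidates` (hypothetical parameter) restricts which points are tried
    as the newly added center; by default every data point is tried.
    """
    candidates = points if candidates is None else candidates
    # For k = 1 the optimal center is the centroid of all points.
    centers = [tuple(sum(col) / len(points) for col in zip(*points))]
    sse = sum(sq_dist(p, centers[0]) for p in points)
    for _ in range(2, k + 1):
        best = None
        for cand in candidates:
            trial = kmeans(points, centers + [tuple(cand)])
            if best is None or trial[1] < best[1]:
                best = trial
        centers, sse = best
    return centers, sse


# Toy data: two well-separated square blobs.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0)]
centers, sse = global_kmeans(points, 2)
print(sorted(centers), sse)
```

Each added center costs one k-means run per candidate, so passing a shorter `candidates` list lowers the running time. The abstract reports a better time complexity than global k-means but does not say how NGKCA achieves it, so this parameter only illustrates the cost structure.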
