A K-means Text Clustering Algorithm Based on Optimizing Initial Points (基于优化初始中心点的K-means文本聚类算法)
Cited by: 8
Abstract: The K-means algorithm terminates at a local optimum, so the choice of initial center points strongly affects the clustering result. To address this problem, the paper proposes an algorithm that optimizes the initial center points. Compared with the traditional algorithm, the proposed algorithm obtains higher-quality initial centers. Experiments show that it effectively reduces the number of iterations and improves clustering accuracy, yielding better clustering results overall.
Author: Zhang Shibo (张世博)
Source: Computer & Digital Engineering (《计算机与数字工程》), 2011, No. 10, pp. 30-31 (2 pages)
Keywords: K-means; clustering; initial center point
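
The sensitivity to initial centers described in the abstract can be illustrated with a small sketch. This record does not describe the paper's actual initialization procedure, so the code below is not the author's method: it runs plain Lloyd-style K-means on a toy 2-D data set from a deliberately poor start and from a farthest-first (max-min distance) start, which is used here only as a stand-in heuristic. All function names and the data are assumptions made for the illustration.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def lloyd_kmeans(points, centers, max_iter=100):
    """Plain K-means (Lloyd iterations) from the given initial centers.

    Returns (centers, labels, sse), where sse is the sum of squared
    distances from each point to its assigned center.
    """
    centers = list(centers)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: a local optimum
            break
        labels = new_labels
        for j in range(len(centers)):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = mean(members)
    sse = sum(sq_dist(p, centers[l]) for p, l in zip(points, labels))
    return centers, labels, sse


def farthest_first_centers(points, k):
    """Stand-in heuristic (assumption, not the paper's algorithm):
    start from the first point, then repeatedly add the point whose
    distance to its nearest chosen center is largest."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        )
    return centers


# Toy data: three well-separated groups of four points each.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 0), (10, 1), (11, 0), (11, 1),
    (5, 10), (5, 11), (6, 10), (6, 11),
]

# Poor start: two centers inside the same group, one between the other two.
_, _, bad_sse = lloyd_kmeans(points, [(0, 0), (1, 1), (8, 5)])

# Spread-out start chosen by the farthest-first heuristic.
_, _, good_sse = lloyd_kmeans(points, farthest_first_centers(points, 3))

print("SSE from poor start:          ", bad_sse)
print("SSE from farthest-first start:", good_sse)
```

On this toy data both runs converge, but the poor start settles in a local optimum that merges two groups, so its final sum of squared errors is much larger. The example says nothing about how large the effect is on real text collections.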

