An Improved Method for Clustering Large-Sample Data (cited by: 5)

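Abstract: The K-means algorithm is one of the most commonly used algorithms for cluster analysis of large-sample data. Its drawback is that the number of clusters k must be specified in advance. This paper proposes applying the golden section method to measure clustering validity; the method automatically optimizes and determines the best number of clusters, thereby achieving effective clustering of large-sample data. Real data are used to demonstrate that the method is reasonable and effective.
Author: Bian Yiwen (卞亦文)
Source: Statistics & Decision (《统计与决策》), CSSCI, Peking University Core Journal, 2009, No. 1, pp. 12-13 (2 pages)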
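
The abstract does not spell out the validity measure or the search procedure, so the following is only a minimal sketch of the general idea: a golden-section search over integer values of k that minimizes a cluster-validity index computed from K-means results. Everything named here is an assumption made for illustration and is not taken from the paper: the index (mean within-cluster squared distance divided by the smallest squared distance between centers), the deterministic farthest-point initialization, and the function names `kmeans`, `validity` and `golden_section_k`. The search also assumes the index is unimodal in k over the chosen bracket, which may not hold on real data.

```python
import math


def assign(points, centers):
    """Group points by nearest center (ties go to the lower center index)."""
    groups = [[] for _ in centers]
    for p in points:
        j = min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
        groups[j].append(p)
    return groups


def maximin_init(points, k):
    """Assumed initialization: start at points[0], then add farthest points."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    return centers


def kmeans(points, k, max_iter=100):
    """Plain Lloyd iterations; returns (centers, groups)."""
    centers = maximin_init(points, k)
    for _ in range(max_iter):
        groups = assign(points, centers)
        updated = [
            tuple(sum(v) / len(g) for v in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
        if updated == centers:
            break
        centers = updated
    return centers, assign(points, centers)


def validity(points, k):
    """Assumed index (lower is better): intra-cluster / inter-center spread.

    Requires k >= 2 and distinct centers.
    """
    centers, groups = kmeans(points, k)
    intra = sum(
        math.dist(p, c) ** 2 for g, c in zip(groups, centers) for p in g
    ) / len(points)
    inter = min(
        math.dist(a, b) ** 2
        for i, a in enumerate(centers)
        for b in centers[i + 1:]
    )
    return intra / inter


def golden_section_k(score, lo, hi):
    """Golden-section search for the integer k in [lo, hi] minimizing score."""
    inv_phi = (math.sqrt(5) - 1) / 2  # about 0.618
    cache = {}

    def f(k):
        if k not in cache:
            cache[k] = score(k)  # each k is clustered at most once
        return cache[k]

    while hi - lo > 2:
        width = hi - lo
        x1 = lo + round((1 - inv_phi) * width)
        x2 = lo + round(inv_phi * width)
        if x1 >= x2:  # integer rounding can merge the two probes
            x2 = x1 + 1
        if f(x1) <= f(x2):
            hi = x2  # under unimodality the minimum lies in [lo, x2]
        else:
            lo = x1  # under unimodality the minimum lies in [x1, hi]
    return min(range(lo, hi + 1), key=f)


# Toy data: three well-separated 3x3 grids of points (illustrative only).
points = [
    (cx + dx, cy + dy)
    for cx, cy in [(0, 0), (10, 0), (0, 10)]
    for dx in (-1, 0, 1)
    for dy in (-1, 0, 1)
]
best_k = golden_section_k(lambda k: validity(points, k), 2, 5)
```

On this toy data the search is run over the bracket k = 2 to 5. The bracket is kept narrow on purpose: an index of this kind is not guaranteed to be unimodal for larger k, and a golden-section search can discard the true optimum when that assumption fails.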