An Approximated Linear Clustering Method (ALCM): an approximate linear-time clustering algorithm
Abstract: Cluster analysis is an important research direction in data mining, and PAM is one of its important algorithms. To address the fact that PAM does not scale to large data sets, this paper presents an Approximated Linear Clustering Method (ALCM) and proves theoretically that its time complexity is linear, O(n), where n is the number of data points. Comparative experiments show that: (1) as the number of data points grows, the running time of PAM increases sharply, whereas the running time of ALCM grows approximately linearly, so ALCM is suited to large data sets; (2) as the number of data points grows, the cost-function values obtained by PAM and ALCM show no obvious difference.
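The abstract defines neither the cost function nor the ALCM procedure. As a point of reference only, the sketch below assumes the standard k-medoids objective (the sum of distances from each point to its nearest medoid) and shows one exhaustive PAM-style swap pass. The names `total_cost` and `pam_swap_pass`, and the choice of Euclidean distance, are illustrative assumptions rather than the paper's definitions. The pass tries on the order of k(n-k) swaps and re-evaluates the cost over all n points for each one, which suggests why PAM's running time grows quickly with n. This is not an implementation of ALCM.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def total_cost(points: Sequence[Point], medoids: Sequence[int]) -> float:
    """Assumed k-medoids objective: sum of distances to the nearest medoid."""
    return sum(
        min(math.dist(p, points[m]) for m in medoids)
        for p in points
    )


def pam_swap_pass(
    points: Sequence[Point], medoids: Sequence[int]
) -> Tuple[List[int], float]:
    """One exhaustive swap pass in the style of PAM (illustrative sketch).

    Tries replacing each medoid with each non-medoid point and returns the
    single best configuration found, together with its cost.
    """
    best = list(medoids)
    best_cost = total_cost(points, best)
    current = set(medoids)
    for i in range(len(medoids)):          # k medoid slots
        for h in range(len(points)):       # n candidate replacements
            if h in current:
                continue
            candidate = list(medoids)
            candidate[i] = h
            cost = total_cost(points, candidate)  # O(n * k) per candidate
            if cost < best_cost:
                best, best_cost = candidate, cost
    return best, best_cost


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
    start = [0, 1]  # both initial medoids sit in the left-hand group
    print(total_cost(data, start))
    print(pam_swap_pass(data, start))
```

In this toy data set, a single swap moves one medoid into the right-hand group and lowers the cost substantially; a full PAM run would repeat such passes until no swap improves the cost.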
Author: Sun Junhua (孙军华)
Source: Journal of Guangxi Teachers Education University (Natural Science Edition), 2005, No. 3, pp. 80-84 (5 pages)
Keywords: cluster analysis; linear time; algorithm; data mining