
Extended Grid-based Clustering Algorithm with Referential Parameters (cited by: 4)
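Abstract In earlier work, the authors proposed a grid-based clustering algorithm with referential parameters (GRPC). That algorithm approaches clustering from the user's point of view and, as far as possible, keeps the user from setting clustering parameters blindly. This paper extends GRPC in two respects, high dimensionality and scalability: clustering in a high-dimensional data space is decomposed into clustering in two-dimensional data spaces, and random sampling is used to handle large-scale data sets. Simulation experiments show that the algorithm can effectively cluster fairly large data sets in data spaces of three or more dimensions. By calculating density threshold data, some effective referential parameters were worked out and provided to users, and a new clustering algorithm called GRPC was presented. With the help of these referential parameters, the algorithm can not only cluster general data but also separate high-density clusters from low-density clusters. This addresses the low cluster quality of traditional grid clustering algorithms, which usually ignore the distribution of the data when partitioning the grid. Experimental results show that the new algorithm effectively distinguishes outliers or noise from clusters and discovers clusters of arbitrary shape, with good clustering quality.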
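The abstract names three ingredients: a grid partition, a density threshold, and random sampling. The Python sketch below is only an illustration of those ideas in two dimensions, not the authors' GRPC implementation. The function names, the fixed `cell_size`, the 8-neighbour merging rule, and the user-supplied `density_threshold` are all assumptions made for this example; the abstract does not say how GRPC derives its referential parameters or how it combines the two-dimensional results in higher dimensions, so neither step is shown.

```python
import random
from collections import defaultdict, deque


def grid_cluster_2d(points, cell_size, density_threshold):
    """Label 2-D points by connected groups of dense grid cells; -1 marks noise."""
    # Bin every point index into the grid cell that contains it.
    cells = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        cells[(int(x // cell_size), int(y // cell_size))].append(idx)

    # Assumption: a cell is "dense" when it holds at least density_threshold points.
    dense = {c for c, members in cells.items() if len(members) >= density_threshold}

    labels = [-1] * len(points)
    seen = set()
    next_label = 0
    for start in sorted(dense):
        if start in seen:
            continue
        # Breadth-first search over the 8 neighbouring cells merges adjacent dense cells.
        queue = deque([start])
        seen.add(start)
        while queue:
            cx, cy = queue.popleft()
            for idx in cells[(cx, cy)]:
                labels[idx] = next_label
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    nb = (cx + dx, cy + dy)
                    if nb in dense and nb not in seen:
                        seen.add(nb)
                        queue.append(nb)
        next_label += 1
    return labels


def sample_and_cluster(points, sample_size, cell_size, density_threshold, seed=0):
    """Cluster a uniform random sample instead of the full data set."""
    rng = random.Random(seed)
    sample = rng.sample(points, min(sample_size, len(points)))
    return sample, grid_cluster_2d(sample, cell_size, density_threshold)
```

Points that fall in cells below the threshold keep the label -1, which is how this sketch separates noise from clusters. Clustering a sample rather than the full data set trades some accuracy for speed, and the threshold would need to be scaled to the sample size.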
Source Journal of Hunan University: Natural Sciences (indexed in EI, CAS, CSCD, PKU Core), 2009, No. 2, pp. 48-52 (5 pages)
Funding Supported by the National Natural Science Foundation of China (10572048, 50677069)
Keywords grid clustering; density threshold; clustering algorithm; data mining

References (9)

  • 1 HAN J, KAMBER M. Data mining: concepts and techniques[M]. San Francisco: Morgan Kaufmann Publishers, 2003.
  • 2 KAUFMAN L, ROUSSEEUW P J. Finding groups in data: an introduction to cluster analysis[M]. New York: John Wiley & Sons, 1990.
  • 3 NG R, HAN J. Efficient and effective clustering method for spatial data mining[C]//Proc 1994 Int Conf Very Large Data Bases (VLDB'94). Santiago, Chile, 1994: 144-155.
  • 4 ESTER M, KRIEGEL H P, SANDER J, et al. A density-based algorithm for discovering clusters in large spatial databases with noise[C]//Proceedings of 2nd Int Conf on Knowledge Discovery and Data Mining. Portland: AAAI Press, 1996: 226-231.
  • 5 KARYPIS G, HAN E H, KUMAR V. Chameleon: a hierarchical clustering algorithm using dynamic modeling[J]. IEEE Computer, 1999, 32(8): 68-75.
  • 6 GUHA S, RASTOGI R, SHIM K. CURE: an efficient clustering algorithm for large databases[C]//Proc of ACM SIGMOD International Conference on Management of Data. Seattle: ACM Press, 1998: 73-84.
  • 7 ERTOZ L, STEINBACH M, KUMAR V. Finding clusters of different sizes, shapes and densities in noisy, high dimensional data[C]. Canada: SIAM Press, 2003: 1-12.
  • 8 QIU Baozhi, ZHANG Xizhi. A grid-based clustering algorithm with automated parameters[J]. Journal of Zhengzhou University (Engineering Science), 2006, 27(2): 91-93. (in Chinese; cited by: 14)
  • 9 ZHOU Y T, YI X D, WU Z G. A grid-based clustering algorithm with referential value of parameters[C]//Proceedings of the International Symposium on Computer Science and Technology. United States of America and China: The American Scholars Press, 2007: 210-214.
