期刊文献+

一种多空间聚类算法 被引量:6

A Multi-Space Clustering Algorithm
下载PDF
导出
摘要 CLARANS算法是经典的划分聚类算法,其核心思想是采用随机重启的局部搜索方式搜索中心点.由于搜索空间布满了局部最优解的“陷阱”,因此它难以获得全局最优解,从而影响了聚类质量.针对这个缺点,本文将多空间思想与CLARANS算法相结合,提出了基于多空间思想的CLARANS算法—CABMS(CLARANSAlgorithmBasedonMulti-Space).该算法的基本思路是采用空间变换策略构造一系列光滑程度不同的搜索空间,在不同的搜索空间中执行CLARANS算法,并利用前层搜索空间的聚类结果来引导本层搜索空间的聚类.CABMS能够跳过局部最优解的“陷阱”,增大获得全局最优解的概率,达到提高聚类质量的目的.本文给出了等距法多空间构造策略,并通过实验对比了CLARANS算法与CABMS算法的聚类质量.实验结果表明,CABMS的聚类质量较CLARANS有较大改进. As a classical partition clustering algorithm, CLARANS uses local search with random restart to find clusters' central points. Due to the great number of local optimum "traps", it is hard for CLARANS to find the global optimum solutions. This paper incorporates the Multi-Space theory into CLARANS, and proposes one new algorithm CLARANS Algorithm Based on Multi-Space (CABMS). Its basic idea is to construct a series of smoothed search spaces with space transformation strategy, and runs CLARANS in every search space. CABMS uses the solution in the former search space as an initial solution for the current space. Through such way, CABMS could jump out of the local optimum "traps", and improve the probability of getting global optimum. This paper gives out Displacement strategy for space transformation, and experiment results indicated that the CABMS algorithm had achieved remarkable improvement over CLARANS in terms of quality.
出处 《小型微型计算机系统》 CSCD 北大核心 2006年第12期2297-2300,共4页 Journal of Chinese Computer Systems
基金 国家自然科学基金项目(90412007)资助 国家自然科学基金项目(60503003)资助 辽宁省博士启动基金项目(20051082)资助 大连理工大学青年教师培养基金资助.
关键词 聚类 多空间 CLARANS clustering multi-space CLARANS
  • 相关文献

参考文献14

  • 1Pavel Berkhin.Survey of clustering data mining techniques[R].Technical Report,Accrue Software,2002.
  • 2钱卫宁,周傲英.从多角度分析现有聚类算法(英文)[J].软件学报,2002,13(8):1382-1394. 被引量:86
  • 3Hartigan J,Wong M.A k-means clustering algorithm[J].Applied Statistics,1979,(28):100-108.
  • 4Raymond T.Ng,Han Jia-wei.Efficient and effective clustering methods for spatial data mining[C].Proceeding of the 20th VLDB Conference Santiago,Chile,1994.
  • 5Ordonez C,Omiecinski E.FREM:fast and robust EM clustering for large data sets[C/OL].In:ACM CIKM Conference,2002.590-599.http://citeseer.ist.psu.edu/536108.html.
  • 6Karypis G,Han E H,KUMAR V.Chameleon:a hierarchical clustering algorithm using dynamic modeling[J].Computer,1999,32:68-75.
  • 7Zhang T,Ramakrishna R,Livny M.BIRCH:a new data clustering algorithm and its applications[J].Journal of Data Mining and Knowledge Discovery,1997,1(2):141-182.
  • 8Sheikholeslami G,Chatterjee S,Zhang A.WaveCluster:a multi-resolution clustering approach for very large spatial databases[C].In:Proceedings of the 24th Conference on VLDB,New York,NY,1998:428-439.
  • 9Wang W,Yang J,Muntz R.STING:a statistical information grid approach to spatial data mining[C].In:Proceedings of the 23rd Conference on VLDB.Athens,Greece,1997:186-195.
  • 10Rakesh A,Johanners G,Dimitrios G,et al.Automatic subspace clustering of high dimensional data for data mining applications[A].In:Snodgrass RT,Winslett M,eds[C].Proc.of the 1994 ACM SIGMOD Intl Conf.on Management of Data.Minneapolis:ACM Press,1994:94-105.

二级参考文献36

  • 1[1]Fasulo, D. An analysis of recent work on clustering algorithms. Technical Report, Department of Computer Science and Engineering, University of Washington, 1999. http://www.cs.washington.edu.
  • 2[2]Baraldi, A., Blonda, P. A survey of fuzzy clustering algorithms for pattern recognition. IEEE Transactions on Systems, Man and Cybernetics, Part B (Cybernetics), 1999,29:786~801.
  • 3[3]Keim, D.A., Hinneburg, A. Clustering techniques for large data sets - from the past to the future. Tutorial Notes for ACM SIGKDD 1999 International Conference on Knowledge Discovery and Data Mining. San Diego, CA, ACM, 1999. 141~181.
  • 4[4]McQueen, J. Some methods for classification and Analysis of Multivariate Observations. In: LeCam, L., Neyman, J., eds. Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability. 1967. 281~297.
  • 5[5]Zhang, T., Ramakrishnan, R., Livny, M. BIRCH: an efficient data clustering method for very large databases. In: Jagadish, H.V., Mumick, I.S., eds. Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data. Quebec: ACM Press, 1996. 103~114.
  • 6[6]Guha, S., Rastogi, R., Shim, K. CURE: an efficient clustering algorithm for large databases. In: Haas, L.M., Tiwary, A., eds. Proceedings of the 1998 ACM SIGMOD International Conference on Management of Data. Seattle: ACM Press, 1998. 73~84.
  • 7[7]Beyer, K.S., Goldstein, J., Ramakrishnan, R., et al. When is 'nearest neighbor' meaningful? In: Beeri, C., Buneman, P., eds. Proceedings of the 7th International Conference on Data Theory, ICDT'99. LNCS1540, Jerusalem, Israel: Springer, 1999. 217~235.
  • 8[8]Ester, M., Kriegel, H.-P., Sander, J., et al. A density-based algorithm for discovering clusters in large spatial databases with noises. In: Simoudis, E., Han, J., Fayyad, U.M., eds. Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining (KDD'96). AAAI Press, 1996. 226~231.
  • 9[9]Ester, M., Kriegel, H.-P., Sander, J., et al. Incremental clustering for mining in a data warehousing environment. In: Gupta, A., Shmueli, O., Widom, J., eds. Proceedings of the 24th International Conference on Very Large Data Bases. New York: Morgan Kaufmann, 1998. 323~333.
  • 10[10]Sander, J., Ester, M., Kriegel, H.-P., et al. Density-Based clustering in spatial databases: the algorithm GDBSCAN and its applications. Data Mining and Knowledge Discovery, 1998,2(2):169~194.

共引文献85

同被引文献38

引证文献6

二级引证文献15

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部