A Multi-Space Clustering Algorithm (一种多空间聚类算法)

Cited by: 6

Abstract: CLARANS is a classical partitioning clustering algorithm whose core idea is to search for cluster center points (medoids) by local search with random restarts. Because the search space is riddled with local-optimum "traps", CLARANS rarely reaches the global optimum, and clustering quality suffers. To address this weakness, the paper combines the multi-space idea with CLARANS and proposes CABMS (CLARANS Algorithm Based on Multi-Space). The basic idea is to use a space transformation strategy to construct a series of search spaces with different degrees of smoothness, run CLARANS in each of them, and use the clustering result from the previous search space as the initial solution that guides clustering in the current one. In this way CABMS can skip over local-optimum traps and raise the probability of reaching the global optimum, which improves clustering quality. The paper presents an equidistant multi-space construction strategy (等距法; called the "Displacement strategy" in the original English abstract) and compares the clustering quality of CLARANS and CABMS experimentally. The experimental results show that CABMS achieves a considerable improvement in clustering quality over CLARANS.
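The layered search described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the space transformation, so `smooth` uses a power-law pull toward the mean distance purely as a stand-in assumption, and the names `cabms_sketch`, `alphas`, `num_local`, and `max_neighbor` are hypothetical. Only the overall shape (run CLARANS in each smoothed space, seeding each run with the previous result) follows the abstract.

```python
import math
import random


def distance_matrix(points):
    """Pairwise Euclidean distances, scaled into [0, 1]."""
    d = [[math.dist(p, q) for q in points] for p in points]
    dmax = max(max(row) for row in d) or 1.0
    return [[x / dmax for x in row] for row in d]


def smooth(d, alpha):
    """ASSUMED transform (not from the paper): pull each distance toward
    the mean distance. Larger alpha gives a flatter space; alpha = 1
    reproduces the original distances up to rounding."""
    n = len(d)
    mean = sum(map(sum, d)) / (n * n)

    def f(x):
        if x >= mean:
            return mean + (x - mean) ** alpha
        return mean - (mean - x) ** alpha

    return [[f(x) for x in row] for row in d]


def cost(d, medoids):
    """Sum over all points of the distance to the nearest medoid."""
    return sum(min(row[m] for m in medoids) for row in d)


def clarans(d, k, num_local, max_neighbor, rng, start=None):
    """CLARANS on a distance matrix: randomized swap-based local search,
    restarted num_local times; the first restart may use a given start."""
    n = len(d)
    best, best_cost = None, math.inf
    for restart in range(num_local):
        if start is not None and restart == 0:
            current = list(start)
        else:
            current = rng.sample(range(n), k)
        current_cost = cost(d, current)
        failures = 0
        while failures < max_neighbor:
            # A neighbor differs from the current solution in one medoid.
            neighbor = list(current)
            non_medoids = [i for i in range(n) if i not in current]
            neighbor[rng.randrange(k)] = rng.choice(non_medoids)
            neighbor_cost = cost(d, neighbor)
            if neighbor_cost < current_cost:
                current, current_cost, failures = neighbor, neighbor_cost, 0
            else:
                failures += 1
        if current_cost < best_cost:
            best, best_cost = current, current_cost
    return best, best_cost


def cabms_sketch(points, k, alphas=(4, 3, 2, 1), num_local=2,
                 max_neighbor=40, seed=0):
    """Run CLARANS from the smoothest space down to the original one,
    passing each layer's medoids on as the next layer's starting point."""
    rng = random.Random(seed)
    d = distance_matrix(points)
    medoids = None
    for alpha in alphas:
        medoids, _ = clarans(smooth(d, alpha), k, num_local,
                             max_neighbor, rng, start=medoids)
    # Report the cost in the original (unsmoothed) space.
    return sorted(medoids), cost(d, medoids)
```

A call such as `cabms_sketch(points, k=3)` on a list of coordinate tuples returns the medoid indices and their cost in the original space. The equally spaced `alphas` schedule is itself an assumption; the paper's equidistant construction may define the sequence of spaces differently.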

Authors: 赵东东 (Zhao Dongdong), 宗瑜 (Zong Yu), 江贺 (Jiang He), 张宪超 (Zhang Xianchao)

Affiliation: School of Software, Dalian University of Technology (大连理工大学软件学院)

Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), 2006, No. 12, pp. 2297-2300 (4 pages). Indexed in CSCD and the Peking University Core Journals list (北大核心).

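Funding: Supported by the National Natural Science Foundation of China (grants 90412007 and 60503003), the Liaoning Province Doctoral Startup Fund (grant 20051082), and the Dalian University of Technology Young Teachers Training Fund.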

Keywords: clustering; multi-space; CLARANS

Classification code: TP311 [Automation and Computer Technology: Computer Software and Theory]
