New clustering algorithm based on representatives and point density
(Original title: 一种基于代表点和点密度的聚类算法; cited by: 2)

Abstract: Density-based clustering methods perform poorly on data samples whose density is unevenly distributed. To address this shortcoming, the paper proposes a clustering algorithm based on representative points and point density. The algorithm finds clusters by examining the k nearest neighbors of each point in the database. It first selects a seed point as the first representative of a cluster, and the seed's k nearest neighbors form its representative region. If the point density of a point in the representative region satisfies the density threshold, that point becomes a new representative. The search for representatives is repeated in this way, and the representatives whose regions are connected, together with their representative regions, form one cluster. Experimental results show that the algorithm can discover clusters of arbitrary shape, size, and density.
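The expansion procedure described in the abstract can be illustrated with a minimal Python sketch. The abstract does not define the point density, the density threshold, or the seed selection rule, so the choices below are assumptions made only for illustration: density is taken as the inverse of the mean distance to the k nearest neighbors, a region point becomes a new representative when its density is at least `ratio` times that of the current representative, and the densest unlabelled point is used as the next seed. The names `knn_indices`, `point_density`, `cluster`, and `ratio` are hypothetical and do not come from the paper.

```python
import math
from collections import deque


def knn_indices(points, k):
    """For each point, return the indices of its k nearest neighbors (brute force)."""
    result = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        result.append(others[:k])
    return result


def point_density(points, neighbors):
    """Assumed density: inverse of the mean distance to the k nearest neighbors."""
    densities = []
    for i, nbrs in enumerate(neighbors):
        mean_d = sum(math.dist(points[i], points[j]) for j in nbrs) / len(nbrs)
        densities.append(1.0 / mean_d if mean_d > 0 else math.inf)
    return densities


def cluster(points, k=3, ratio=0.5):
    """Grow clusters from seeds by repeatedly promoting region points to representatives."""
    neighbors = knn_indices(points, k)
    density = point_density(points, neighbors)
    labels = [-1] * len(points)
    cluster_id = 0
    # Assumed seed rule: try the densest unlabelled point first.
    for seed in sorted(range(len(points)), key=lambda i: -density[i]):
        if labels[seed] != -1:
            continue
        labels[seed] = cluster_id
        reps = {seed}
        queue = deque([seed])
        while queue:
            rep = queue.popleft()
            for q in neighbors[rep]:
                if labels[q] == -1:
                    labels[q] = cluster_id  # the representative region joins the cluster
                # Assumed threshold: density relative to the current representative.
                if (labels[q] == cluster_id and q not in reps
                        and density[q] >= ratio * density[rep]):
                    reps.add(q)
                    queue.append(q)
        cluster_id += 1
    return labels


# One dense group and one sparse group, far apart.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (20, 20), (20, 25), (25, 20), (25, 25)]
print(cluster(points, k=3, ratio=0.5))  # prints [0, 0, 0, 0, 1, 1, 1, 1]
```

Because the threshold in this sketch is relative to the current representative rather than a single global value, a sparse group can still grow into its own cluster after a dense one has been found. This is one plausible reading of how a method of this kind handles uneven densities, not a confirmed detail of the paper.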

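Authors: Chen Yuanyuan (陈园园), Chen Zhiping (陈治平)

Affiliation: School of Computer and Communication, Hunan University

Source: Computer Engineering and Applications (《计算机工程与应用》), indexed in CSCD and the Peking University Core Journals list, 2008, Issue 28, pp. 136-139 (4 pages)

Keywords: data mining; clustering; point density; representative; density threshold

Classification number: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]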



