LSNCCP: A Clustering Algorithm Based on the Largest Set of Not-Covered Core Points

Cited by: 2

Abstract: Clustering has important applications in many fields, including data mining and pattern recognition. This paper proposes a novel clustering algorithm, LSNCCP (a clustering algorithm based on the largest set of not-covered core points). Starting from the definition of density, it examines the distances between core points and defines three relations between them: covered, intersecting, and separate. It then finds a largest set of not-covered core points and clusters the data on that basis, together with a fast method for resolving the lost-point problem. Because this set is only a small subset of all core points, the algorithm reduces the time that similar algorithms spend searching for core points. The feasibility and advantages of the new algorithm are demonstrated both theoretically and experimentally.
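The abstract does not give the formal definitions, so the following is only a minimal Python sketch of the idea under stated assumptions. It assumes DBSCAN-style core points (at least `min_pts` points within radius `eps`) and assumes that two core points are "covered" when their distance is at most `eps`, "intersecting" when it is at most `2 * eps`, and "separate" otherwise. These thresholds, the function names, and the greedy selection (which yields a maximal set, not necessarily the largest one) are illustrative assumptions, not the paper's stated method.

```python
import math


def neighbors(points, i, eps):
    """Indices of all points within distance eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def relation(p, q, eps):
    """Assumed relation between two core points with the same radius eps."""
    d = math.dist(p, q)
    if d <= eps:
        return "covered"        # one centre lies inside the other's neighbourhood
    if d <= 2 * eps:
        return "intersecting"   # neighbourhoods overlap, centres do not cover
    return "separate"           # neighbourhoods are disjoint


def not_covered_core_points(points, eps, min_pts):
    """Greedily pick core points so that no two selected points cover each other."""
    selected = []
    for i, p in enumerate(points):
        # Cheap check first: skip points already covered by a selected core point.
        if any(relation(p, points[j], eps) == "covered" for j in selected):
            continue
        # Only then pay for the neighbourhood query that decides core status.
        if len(neighbors(points, i, eps)) >= min_pts:
            selected.append(i)
    return selected


if __name__ == "__main__":
    data = [(0, 0), (0.5, 0), (0, 0.5), (0.4, 0.4),
            (5, 5), (5.5, 5), (5, 5.5), (10, 10)]
    print(not_covered_core_points(data, eps=1.0, min_pts=3))
```

In this sketch, a point already covered by a selected core point never triggers a neighbourhood query, which is one way to read the abstract's claim that fewer core points need to be searched.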

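Authors: Xue Yongsheng (薛永生), Weng Wei (翁伟), Wen Juan (文娟), Wang Jinbo (王劲波), Zhang Yu (张宇)

Affiliations: Department of Computer Science, Xiamen University; School of Economics, Xiamen University

Source: Journal of Computer Research and Development (《计算机研究与发展》), indexed in EI, CSCD, and the Peking University Core Journals list; 2004, No. 11, pp. 1930-1935 (6 pages)

Funding: Natural Science Foundation of Fujian Province (A0310008); Key Project of the Fujian Province High-Tech Research Open Program (2003H043)

Keywords: data mining; clustering; density; core points; largest set of not-covered core points

Classification: TP311.13 [Automation and Computer Technology: Computer Software and Theory]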


References (8)

1. U M Fayyad, G Piatetsky-Shapiro, P Smyth. From data mining to knowledge discovery: An overview. In: U M Fayyad, G Piatetsky-Shapiro, P Smyth, eds. Advances in Knowledge Discovery and Data Mining. Menlo Park, CA: AAAI Press, 1996. 1-36
2. M Ester, H P Kriegel, J Sander, et al. A density based algorithm for discovering clusters in large spatial databases with noise. In: E Simoudis, J W Han, U M Fayyad, eds. Proc of the 2nd Int'l Conf on Knowledge Discovery and Data Mining. Portland: AAAI Press, 1996. 226-231
3. S Guha, R Rastogi, K Shim. CURE: An efficient clustering algorithm for large databases. In: L M Haas, A Tiwary, eds. Proc of the ACM SIGMOD Int'l Conf on Management of Data. New York: ACM Press, 1998. 73-84
4. R Agrawal, J Gehrke, D Gunopulos, et al. Automatic subspace clustering of high dimensional data for data mining applications. In: L M Haas, A Tiwary, eds. Proc of the ACM SIGMOD Int'l Conf on Management of Data. New York: ACM Press, 1998. 94-105
5. Zhou Shuigeng, Zhou Aoying, Cao Jing, Hu Yunfa. A fast density-based clustering algorithm. Journal of Computer Research and Development, 2000, 37(11): 1287-1292 (in Chinese). Cited by: 88
6. Ma Shuai, Wang Tengjiao, Tang Shiwei, Yang Dongqing, Gao Jun. A fast clustering algorithm based on reference and density. Journal of Software, 2003, 14(6): 1089-1095 (in Chinese). Cited by: 108
7. S Berchtold, C Bohm, H-P Kriegel. The pyramid-technique: Towards breaking the curse of dimensionality. In: L M Haas, A Tiwary, eds. Proc of the ACM SIGMOD Int'l Conf on Management of Data. New York: ACM Press, 1998. 142-153
8. C Yu, B C Ooi, K-L Tan, et al. Indexing the distance: An efficient method to KNN processing. In: P M G Apers, P Atzeni, S Ceri, et al, eds. Proc of the 27th Int'l Conf on Very Large Data Bases. San Francisco, CA: Morgan Kaufmann, 2001. 421-430


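Citing documents (2)

1. Wang Jinbo, Weng Wei, Xu Huarong. Density-based clustering analysis algorithms in data mining. Statistics and Decision (统计与决策), 2005, 21(10X): 139-141 (in Chinese). Cited by: 2
2. Tao Liang, Ni Zhiwei, Liu Xiao. DBSB: A fast spatial clustering algorithm with heuristic selection of boundary objects. Computer Engineering and Applications (计算机工程与应用), 2007, 43(11): 164-167 (in Chinese).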
