
LSNCCP: A Clustering Algorithm Based on the Largest Set of Not-Covered Core Points (cited by: 2)
Abstract: Clustering has important applications in many fields, including data mining and pattern recognition. This paper proposes a novel clustering algorithm, LSNCCP (a clustering algorithm based on the largest set of not-covered core points). Building on the definition of density, it examines the distance relationships between core points and defines three relations between them: covered, intersecting, and separate. It then finds a largest set of not-covered core points and clusters the data on the basis of that set, and it gives a fast method for solving the lost-point problem. Because the largest set of not-covered core points is only a very small subset of all core points, the algorithm effectively reduces the time that similar algorithms spend searching for core points. The feasibility and advantages of the new algorithm are demonstrated both theoretically and experimentally.
Source: Journal of Computer Research and Development (indexed in EI, CSCD, and the Peking University Core Journals list), 2004, No. 11, pp. 1930-1935 (6 pages)
Funding: Natural Science Foundation of Fujian Province (A0310008); Key Project of the Fujian Province High-Tech Research Open Program (2003H043)
Keywords: data mining; clustering; density; core points; largest set of not-covered core points
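
The abstract names three relations between core points but does not give their definitions, so the following Python sketch is only an illustration under assumed definitions, not the authors' implementation. It assumes a DBSCAN-style core point (at least `min_pts` points, counting the point itself, within radius `eps`), and assumes that two points are "covered" when their distance is at most `eps`, "intersecting" when it is at most `2 * eps`, and "separate" otherwise. The names `relation`, `is_core`, and `not_covered_core_set` are hypothetical. The greedy pass returns a maximal set of mutually not-covered core points, which may not be the largest one.

```python
from math import dist  # Euclidean distance, Python 3.8+


def relation(p, q, eps):
    """Classify two points by distance (assumed thresholds, not from the paper)."""
    d = dist(p, q)
    if d <= eps:
        return "covered"        # each point lies inside the other's eps-neighborhood
    if d <= 2 * eps:
        return "intersecting"   # the two eps-neighborhoods overlap
    return "separate"           # the two eps-neighborhoods are disjoint


def is_core(p, points, eps, min_pts):
    """DBSCAN-style density test: p counts itself among its neighbors."""
    return sum(1 for q in points if dist(p, q) <= eps) >= min_pts


def not_covered_core_set(points, eps, min_pts):
    """Greedily collect core points such that no two of them are 'covered'."""
    selected = []
    for p in points:
        # Points covered by an already selected core point are skipped
        # without running the (expensive) density test on them.
        if any(relation(p, c, eps) == "covered" for c in selected):
            continue
        if is_core(p, points, eps, min_pts):
            selected.append(p)
    return selected


if __name__ == "__main__":
    data = [(0, 0), (0.5, 0), (0, 0.5), (0.4, 0.4),
            (10, 10), (10.5, 10), (10, 10.5), (50, 50)]
    print(not_covered_core_set(data, eps=1.0, min_pts=3))
```

In this sketch the density test runs only on points that no selected core point already covers, which is one way to read the abstract's claim that working with a small subset of the core points shortens the core-point search.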

