Abstract
Clustering has important applications in many fields, including data mining and pattern recognition. This paper proposes a novel clustering algorithm, LSNCCP (a clustering algorithm based on the largest set of not-covered core points). Building on a density definition, it examines the distances between core points and defines three relations between them: covered, intersectant, and separate. It then finds a largest set of not-covered core points and clusters the data on that basis, and it gives a fast method for handling the lost-point problem. Because this set is only a small subset of all core points, the algorithm reduces the time that similar algorithms spend searching for core points. The feasibility and advantages of the new algorithm are demonstrated both theoretically and experimentally.
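The abstract does not give the formal definitions, so the following is only a minimal sketch of the idea under stated assumptions, not the paper's actual procedure. It assumes a DBSCAN-style density definition with hypothetical parameters `eps` and `min_pts`, and it assumes that two core points are "covered" when their distance is at most `eps`, "intersectant" when it is at most `2 * eps`, and "separate" otherwise. The greedy pass yields a maximal set of mutually not-covered core points, which may differ from the set the paper calls "largest".

```python
import math

def neighbors(points, i, eps):
    # Indices of all points within eps of points[i] (the point itself included).
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def core_points(points, eps, min_pts):
    # Assumed density definition: a core point has at least min_pts neighbors.
    return [i for i in range(len(points))
            if len(neighbors(points, i, eps)) >= min_pts]

def relation(p, q, eps):
    # Assumed thresholds for the three relations named in the abstract.
    d = math.dist(p, q)
    if d <= eps:
        return "covered"
    if d <= 2 * eps:
        return "intersectant"
    return "separate"

def not_covered_core_set(points, eps, min_pts):
    # Greedy pass: keep a core point only if no already chosen core point covers it.
    chosen = []
    for i in core_points(points, eps, min_pts):
        if all(relation(points[i], points[j], eps) != "covered" for j in chosen):
            chosen.append(i)
    return chosen

if __name__ == "__main__":
    pts = [(0, 0), (0.1, 0), (0, 0.1), (5, 5), (5.1, 5), (5, 5.1), (10, 10)]
    print(not_covered_core_set(pts, eps=0.5, min_pts=3))
```

In this toy data set six of the seven points are core points, yet the selected set holds only one representative per dense group, which illustrates why the subset can be much smaller than the full set of core points.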
Source
Journal of Computer Research and Development (《计算机研究与发展》)
EI
CSCD
PKU Core Journal (北大核心)
2004, No. 11, pp. 1930-1935 (6 pages)
Funding
Natural Science Foundation of Fujian Province (A0310008)
Key Project of the Fujian Provincial High-Tech Research Open Program (2003H043)
Keywords
data mining
clustering
density
core points
largest set of not-covered core points