Abstract
A new grid-based clustering approach is presented. By hierarchically bisecting each grid cell into two equal-volume cells, the approach uses a new criterion to measure the dissimilarity among all cells and thereby finds candidates for the cluster prototypes in the dataset. It overcomes the sensitivity to input parameters that affects most conventional grid-based clustering approaches, and it works well on datasets containing arbitrarily shaped clusters and clusters of uneven density, at linear computational cost. Two experiments verify the effectiveness of the proposed algorithm.
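The abstract does not spell out the bisection procedure or the dissimilarity criterion, so the following is only a minimal sketch of the equal-volume bisection step, under stated assumptions: cells are axis-aligned boxes, the split axis cycles through the dimensions level by level, and cell membership is half-open. The names `bisect_cell`, `contains`, and `hierarchical_bisection` are hypothetical and do not come from the paper; the paper's dissimilarity measure and prototype selection are not reproduced here.

```python
from typing import List, Tuple

Point = Tuple[float, ...]
# A cell is an axis-aligned box given as (lower corner, upper corner).
Cell = Tuple[Tuple[float, ...], Tuple[float, ...]]


def bisect_cell(cell: Cell, axis: int) -> Tuple[Cell, Cell]:
    """Split a cell at the midpoint of `axis` into two equal-volume halves."""
    lo, hi = cell
    mid = (lo[axis] + hi[axis]) / 2.0
    left_hi = hi[:axis] + (mid,) + hi[axis + 1:]
    right_lo = lo[:axis] + (mid,) + lo[axis + 1:]
    return (lo, left_hi), (right_lo, hi)


def contains(cell: Cell, p: Point) -> bool:
    """Half-open membership test (assumption): lo <= p < hi on every axis."""
    lo, hi = cell
    return all(l <= x < h for l, x, h in zip(lo, p, hi))


def hierarchical_bisection(
    points: List[Point], root: Cell, levels: int
) -> List[Tuple[Cell, List[Point]]]:
    """Bisect every cell `levels` times, cycling through the axes (assumption).

    Returns the leaf cells, each paired with the points that fall inside it.
    Every point is examined once per level.
    """
    cells = [(root, [p for p in points if contains(root, p)])]
    dims = len(root[0])
    for level in range(levels):
        axis = level % dims
        next_cells = []
        for cell, pts in cells:
            for half in bisect_cell(cell, axis):
                next_cells.append((half, [p for p in pts if contains(half, p)]))
        cells = next_cells
    return cells


if __name__ == "__main__":
    data = [(1.0, 1.0), (1.5, 0.5), (3.0, 3.0), (3.5, 3.5)]
    for cell, pts in hierarchical_bisection(data, ((0.0, 0.0), (4.0, 4.0)), 2):
        print(cell, len(pts))
```

The per-cell point counts from such a grid could then feed whatever dissimilarity criterion is chosen; because of the half-open test, points lying exactly on the upper boundary of the root cell are excluded in this sketch.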
Source
《计算机研究与发展》
EI
CSCD
Peking University Core Journals (北大核心)
2005, No. 9, pp. 1505-1510 (6 pages)
Journal of Computer Research and Development
Funding
Science and Technology Research Fund Project of the Jiangxi Provincial Department of Education (赣教技字[2005]118号)
Keywords
bisection
cluster analysis
high-dimensional data
validity