Abstract
The Chameleon algorithm can discover high-quality natural clusters of arbitrary shape, size, and density, while one-pass clustering is fast and efficient. Combining these two strengths, this paper studies an efficient clustering algorithm that can handle data with mixed attributes. First, Chameleon is modified slightly so that it can process data containing categorical attributes. A two-stage clustering algorithm is then proposed. In the first stage, a one-pass clustering algorithm groups the data set into an initial partition. In the second stage, the improved Chameleon algorithm merges the initial partition to obtain the final clusters. Experimental results on real and synthetic data sets show that the proposed two-stage clustering algorithm is effective and feasible.
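The two-stage structure described in the abstract can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration, since the abstract gives no formulas: the mixed-attribute distance (absolute difference for numeric attributes, 0/1 mismatch for categorical ones), the leader-style one-pass rule, the parameters `radius` and `merge_threshold`, and all function names are hypothetical. In particular, stage 2 below merges by minimum pairwise distance as a simple stand-in; it is not the improved Chameleon merge criterion, which relies on relative interconnectivity and relative closeness over a graph.

```python
from typing import List, Sequence, Union

Value = Union[float, str]
Record = Sequence[Value]


def mixed_distance(a: Record, b: Record) -> float:
    """Assumed distance: |x - y| for numeric attributes, 0/1 mismatch for categorical ones."""
    total = 0.0
    for x, y in zip(a, b):
        if isinstance(x, str) or isinstance(y, str):
            total += 0.0 if x == y else 1.0
        else:
            total += abs(x - y)
    return total


def one_pass_partition(data: List[Record], radius: float) -> List[List[int]]:
    """Stage 1 (assumed leader-style rule): scan the data once; each record joins
    the closest cluster whose first member lies within `radius`, else starts a new one."""
    clusters: List[List[int]] = []
    for i, rec in enumerate(data):
        best, best_d = None, radius
        for c in clusters:
            d = mixed_distance(rec, data[c[0]])
            if d <= best_d:
                best, best_d = c, d
        if best is None:
            clusters.append([i])
        else:
            best.append(i)
    return clusters


def merge_partitions(data: List[Record], clusters: List[List[int]],
                     merge_threshold: float) -> List[List[int]]:
    """Stage 2 (stand-in, NOT Chameleon's criterion): repeatedly merge the two
    sub-clusters with the smallest minimum pairwise distance until that
    distance exceeds `merge_threshold`."""
    clusters = [list(c) for c in clusters]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(mixed_distance(data[p], data[q])
                        for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        if best[0] > merge_threshold:
            break
        _, i, j = best
        clusters[i].extend(clusters[j])
        del clusters[j]
    return clusters


# Toy mixed-attribute data: one numeric and one categorical attribute.
data = [(0.0, "a"), (1.0, "a"), (2.0, "a"), (3.0, "a"), (10.0, "b"), (11.0, "b")]
initial = one_pass_partition(data, radius=1.0)
final = merge_partitions(data, initial, merge_threshold=1.0)
print(initial)  # [[0, 1], [2, 3], [4, 5]]
print(final)    # [[0, 1, 2, 3], [4, 5]]
```

On this toy data the one-pass stage splits the elongated group into two small pieces, and the merge stage joins them again. That is the division of labor the abstract describes: a cheap initial partition, followed by a more careful merge that works on far fewer objects than the raw data set.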
Source
Journal of Chinese Computer Systems (《小型微型计算机系统》)
Indexed in CSCD and the Peking University Core Journals list
2010, Issue 8, pp. 1643-1646 (4 pages)
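Funding
Supported by the National Natural Science Foundation of China (60673191)
Supported by the Natural Science Foundation of Guangdong Province (9151026005000002)
Supported by the Key Natural Science Research Project of Guangdong Higher Education Institutions (06Z012)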
Keywords
one-pass clustering algorithm
graph-based clustering algorithm
arbitrarily shaped clusters