Abstract
This paper analyzes the CCL algorithm, defines a class-level constraint coefficient (CCC) based on the instance-level constraints between data objects, and proposes a hybrid constrained hierarchical clustering algorithm, HCCL. HCCL consists of two relatively independent stages. The first stage is almost identical to the Complete-link algorithm: data objects are merged into small classes according to their natural distribution. In the second stage, the small classes are merged further on the basis of background knowledge, using the CCC, with neighborhood information assisting the decision. Experiments on real-world datasets show that HCCL is more stable than CCL and, overall, makes more effective use of the given constraints.
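The two-stage structure described in the abstract can be illustrated with a minimal sketch. The abstract does not give the definition of the CCC, the stopping rule for the first stage, or how neighborhood information enters the decision, so everything specific here is an assumption: `coefficient` is a hypothetical stand-in (must-link pairs minus cannot-link pairs crossing two clusters), `n_small` is a hypothetical stopping parameter, and the complete-link distance is used only as a tie-breaker. This is not the authors' implementation.

```python
from itertools import combinations
from math import dist


def complete_link(a, b, points):
    """Complete-link distance: the largest pairwise distance between two clusters."""
    return max(dist(points[i], points[j]) for i in a for j in b)


def stage_one(points, n_small):
    """Plain complete-link agglomeration down to n_small small clusters (assumed stop rule)."""
    clusters = [frozenset([i]) for i in range(len(points))]
    while len(clusters) > n_small:
        a, b = min(combinations(clusters, 2),
                   key=lambda p: complete_link(p[0], p[1], points))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return clusters


def coefficient(a, b, must_link, cannot_link):
    """Hypothetical stand-in for the CCC: must-link minus cannot-link pairs across a and b."""
    def crossing(pairs):
        return sum(1 for i, j in pairs if (i in a and j in b) or (i in b and j in a))
    return crossing(must_link) - crossing(cannot_link)


def stage_two(clusters, points, must_link, cannot_link, k):
    """Merge small clusters by coefficient; a smaller complete-link distance breaks ties."""
    clusters = list(clusters)
    while len(clusters) > k:
        a, b = max(combinations(clusters, 2),
                   key=lambda p: (coefficient(p[0], p[1], must_link, cannot_link),
                                  -complete_link(p[0], p[1], points)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return clusters


# Toy data: three tight pairs on a line, with constraints given on object indices.
points = [(0, 0), (0, 1), (5, 0), (5, 1), (10, 0), (10, 1)]
must_link = [(0, 4)]      # objects 0 and 4 should end up together
cannot_link = [(0, 2)]    # objects 0 and 2 should stay apart

small = stage_one(points, n_small=3)
final = stage_two(small, points, must_link, cannot_link, k=2)
print(sorted(sorted(c) for c in final))
```

In this toy run the first stage recovers the three tight pairs from geometry alone, and the second stage joins the two outer pairs because the must-link constraint outweighs their larger distance, while the cannot-link constraint keeps the middle pair separate.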
Source
Journal of Fuzhou University (Natural Science Edition)
CAS
CSCD
PKU Core (北大核心)
2005, No. 5, pp. 574-579 (6 pages)
Keywords
algorithm
clustering analysis
link
constraint