Abstract
The performance of semi-supervised clustering algorithms is limited by the number of available pairwise constraints, and most existing research depends on the number of original pairwise constraints. To address this, the paper first proposes an initialization algorithm of pair constraints based on grey relational analysis (PCIG). The algorithm computes the similarity between data objects using the degree of balance and approach (均衡接近度), determines a confidence interval from the similarity values, and then expands the pairwise relationships between data objects by borrowing a network-structure initialization method. Finally, the expanded constraints are applied to a label propagation clustering algorithm. In experiments on five benchmark datasets, the label propagation clustering algorithm based on the improved pairwise constraint expansion achieves higher NMI and ARI values than the other methods compared. The experimental results show that the improved pairwise constraint expansion can effectively improve the clustering performance of the label propagation algorithm.
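The constraint-expansion idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula for the degree of balance and approach, so this sketch substitutes Deng's classic grey relational grade as the similarity; it is not the paper's PCIG procedure. The function names, the distinguishing coefficient `rho`, and the thresholds `low` and `high` (standing in for the bounds of the confidence interval) are all assumptions made for illustration.

```python
from itertools import combinations


def grey_relational_grade(x, y, d_min, d_max, rho=0.5):
    """Deng's grey relational grade between two feature vectors.

    Stand-in similarity: the paper's own measure is not specified
    in the abstract.
    """
    if d_max == 0:
        return 1.0  # all objects are identical
    coeffs = [
        (d_min + rho * d_max) / (abs(a - b) + rho * d_max)
        for a, b in zip(x, y)
    ]
    return sum(coeffs) / len(coeffs)


def expand_constraints(data, low=0.5, high=0.9, rho=0.5):
    """Derive must-link / cannot-link pairs from pairwise similarity.

    `low` and `high` are hypothetical thresholds: pairs at or above
    `high` become must-link, pairs at or below `low` become
    cannot-link, and pairs in between stay unconstrained.
    """
    pairs = list(combinations(range(len(data)), 2))
    diffs = [abs(a - b) for i, j in pairs for a, b in zip(data[i], data[j])]
    if not diffs:
        return set(), set()
    d_min, d_max = min(diffs), max(diffs)

    must_link, cannot_link = set(), set()
    for i, j in pairs:
        s = grey_relational_grade(data[i], data[j], d_min, d_max, rho)
        if s >= high:
            must_link.add((i, j))
        elif s <= low:
            cannot_link.add((i, j))
    return must_link, cannot_link


# Two tight groups of two points each (features scaled to [0, 1]).
points = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
must_link, cannot_link = expand_constraints(points)
```

On this toy input the two nearby pairs end up as must-link constraints and every cross-group pair as a cannot-link constraint. In a full pipeline, such derived pairs would supplement the original constraints before being passed to a label propagation clustering step.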
Authors
Wu Yinghao (吴颖豪)
Liu Hong (刘虹)
Zhang Qishan (张岐山)
College of Economics & Management, Fuzhou University, Fuzhou 350000, China
Source
Application Research of Computers (《计算机应用研究》)
CSCD
PKU Core Journal (北大核心)
2022, No. 12, pp. 3592-3597 (6 pages)
Keywords
semi-supervised clustering
pairwise constraints
label propagation
grey relational analysis