Deep graph clustering with hard sample sampling joint contrastive augmentation
(original title: 困难样本采样联合对比增强的深度图聚类)

Cited by: 1

Abstract: Graph clustering based on hard sample mining is a recent research hotspot. Existing algorithms have three main problems: contrastive methods and sample-pair weighting strategies lack a good fusion mechanism; "false negative" samples within a view are ignored when positive samples are drawn; and the contribution of graph-level information to clustering is neglected. To address these problems, this paper proposes a deep graph clustering algorithm that combines hard sample sampling with contrastive augmentation. First, an autoencoder learns the embeddings, and a self-weighted contrastive loss for representation learning is designed from the computed pseudo-labels, similarities, and confidence information, unifying node-level contrast across views with the hard sample pair weighting strategy. By adjusting the weights of sample pairs in different confidence regions, the loss function drives the model to focus on different types of hard samples and learn discriminative features, which improves the consistency of intra-cluster representations and the separation of inter-cluster representations, and strengthens the model's ability to discriminate samples. Second, the graph-level representation is projected by a clustering network, and a cluster contrastive loss maximizes the consistency of cluster representations across views. Finally, the two contrastive losses are combined and optimized iteratively through a self-supervised training mechanism to complete the clustering task. Compared with 9 baseline clustering algorithms on 5 real-world datasets, the algorithm achieves the best results on 4 established metrics, showing excellent clustering performance. Ablation experiments demonstrate the effectiveness and transferability of the two contrastive modules.
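The abstract describes the self-weighted contrastive loss only in words, so the following is a minimal sketch of one way such a loss could look, not the formula from the paper. Everything in it is an assumption made for illustration: the function names, the weight form `|q - s|**beta`, the confidence threshold, the temperature, and the choice to let the weight scale the similarity inside the exponent. It uses plain Python and assumes non-zero embedding vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pair_weight(sim, same_cluster, high_conf, beta=1.0):
    """Hypothetical weight for one sample pair (an assumption, not the paper's).

    Low-confidence pairs keep weight 1. For high-confidence pairs the weight
    is |q - s|**beta, where q is 1 for pairs sharing a pseudo-label and 0
    otherwise, and s is the cosine similarity rescaled to [0, 1]. Dissimilar
    positives and similar negatives (the hard pairs) get the largest weights.
    """
    if not high_conf:
        return 1.0
    s = (sim + 1.0) / 2.0
    q = 1.0 if same_cluster else 0.0
    return abs(q - s) ** beta


def self_weighted_contrastive_loss(z1, z2, labels, conf,
                                   conf_threshold=0.5, temperature=0.5,
                                   beta=1.0):
    """Mean weighted InfoNCE-style loss, using view-1 nodes as anchors."""
    n = len(z1)
    total = 0.0
    for i in range(n):
        numerator = 0.0
        denominator = 0.0
        for j in range(n):
            sim = cosine(z1[i], z2[j])
            same = labels[i] == labels[j]
            high = conf[i] > conf_threshold and conf[j] > conf_threshold
            w = pair_weight(sim, same, high, beta)
            term = math.exp(w * sim / temperature)
            denominator += term
            # Positives: the same node in the other view, plus confident
            # same-cluster nodes that would otherwise act as false negatives.
            if i == j or (high and same):
                numerator += term
        total += -math.log(numerator / denominator)
    return total / n


if __name__ == "__main__":
    # Toy data: two clusters, all pseudo-labels treated as confident.
    view1 = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
    aligned = [row[:] for row in view1]
    swapped = [[0.0, 1.0], [0.0, 1.0], [1.0, 0.0], [1.0, 0.0]]
    labels = [0, 0, 1, 1]
    conf = [0.9, 0.9, 0.9, 0.9]
    print(self_weighted_contrastive_loss(view1, aligned, labels, conf))
    print(self_weighted_contrastive_loss(view1, swapped, labels, conf))
```

In this sketch, a second view whose cluster structure agrees with the pseudo-labels yields a lower loss than one whose clusters are swapped. If the weight is treated as a constant during differentiation, the gradient with respect to each pair's similarity is scaled by that pair's weight, which is one way to make training concentrate on hard pairs.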

Authors: Zhu Xuanye; Kong Bing; Chen Hongmei; Bao Chongming; Zhou Lihua (School of Information Science & Engineering, Yunnan University, Kunming 650504, China)

Affiliation: School of Information Science and Engineering, Yunnan University

Source: Application Research of Computers (《计算机应用研究》), indexed in CSCD and the Peking University Core Journals list, 2024, No. 6, pp. 1769-1777 (9 pages)

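Funding: National Natural Science Foundation of China (62062066, 61762090, 61966036, 62276227); Key Project of the 2022 Yunnan Province Basic Research Program (202201AS070015); Yunnan Province Reserve Talent Program for Young and Middle-aged Academic and Technical Leaders (202205AC160033); Yunnan Key Laboratory of Intelligent Systems and Computing (202205AG070003).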

Keywords: graph representation learning; attributed graph clustering; contrastive learning; hard sample mining

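Classification: TP311.13 [Automation and Computer Technology: Computer Software and Theory]; TP181 [Automation and Computer Technology: Control Theory and Control Engineering]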

