A deep self-supervised clustering ensemble algorithm (cited by: 6)
Abstract: To address the design of the consensus function in clustering ensembles, this paper proposes a deep self-supervised clustering ensemble algorithm. The algorithm first applies a weighted connected-triple algorithm to the base clustering partitions to compute a similarity matrix between samples. The similarity matrix defines the adjacency relation, so the base clusterings are transformed from a data representation in the feature space into a graph data representation. On this basis, the consensus problem over the base clusterings becomes a graph clustering problem on that graph representation. A graph neural network is then used to build a self-supervised clustering ensemble model. On the one hand, a graph autoencoder learns a low-dimensional embedding of the graph, and the target distribution of the clustering ensemble is estimated from the likelihood distribution of that embedding. On the other hand, the clustering ensemble objective guides the embedding process, so that the graph embedding and the clustering ensemble result obtained by the model are jointly optimal. Simulation experiments on a large number of data sets show that the proposed algorithm further improves the accuracy of the clustering ensemble result compared with algorithms such as HGPA, CSPA, and MCLA.
Authors: DU Hangyuan (杜航原); ZHANG Jing (张晶); WANG Wenjian (王文剑) (College of Computer and Information Technology, Shanxi University, Taiyuan 030006, China; Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan 030006, China)
Source: CAAI Transactions on Intelligent Systems (《智能系统学报》), indexed in CSCD and the Peking University Core Journals list, 2020, No. 6, pp. 1113-1120 (8 pages)
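Funding: National Natural Science Foundation of China (61902227, 61673249, 61773247, U1805263); Shanxi Province Key Research and Development Program for International Cooperation (201903D421050); Shanxi Province Basic Research Program (201901D211192); Shanxi Province Applied Basic Research Program (201701D121053); Shanxi Province 1331 Project.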
Keywords: feature space; clustering algorithm; consistency function; graph representation; similarity measure; self-supervised learning; graph data; neural network model
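
The first step described in the abstract, turning base clusterings into a sample similarity matrix and then into graph adjacency, can be sketched as follows. This is a minimal illustration of the connected-triple idea in one commonly described form (Jaccard weights between clusters, shared-neighbour minima, a decay factor), not the authors' implementation. The function name `wct_similarity`, the decay factor `dc`, and the 0.5 adjacency threshold are assumptions introduced here.

```python
import numpy as np

def wct_similarity(labels, dc=0.9):
    """Sample similarity from base clusterings via connected triples.

    labels: (M, n) integer array, one row per base clustering.
    dc: hypothetical decay factor in (0, 1] for samples in different clusters.
    Returns an (n, n) symmetric matrix with values in [0, 1].
    """
    labels = np.asarray(labels)
    M, n = labels.shape

    # One binary membership vector per cluster, across all base clusterings.
    members, index = [], {}
    for m in range(M):
        for lab in np.unique(labels[m]):
            index[(m, lab)] = len(members)
            members.append(labels[m] == lab)
    B = np.array(members, dtype=float)  # shape (K, n)
    K = B.shape[0]

    # Edge weight between two clusters: Jaccard overlap of their members.
    inter = B @ B.T
    sizes = B.sum(axis=1)
    W = inter / (sizes[:, None] + sizes[None, :] - inter)
    np.fill_diagonal(W, 0.0)

    # Connected triples: clusters x and y that share a neighbour k
    # contribute min(w_xk, w_yk) to their similarity.
    wct = np.zeros((K, K))
    for k in range(K):
        wct += np.minimum(W[:, k][:, None], W[k, :][None, :])
    np.fill_diagonal(wct, 0.0)
    peak = wct.max()
    cluster_sim = dc * wct / peak if peak > 0 else wct

    # Sample similarity: 1 when two samples share a cluster in a base
    # clustering, otherwise the similarity of their two clusters;
    # averaged over the M base clusterings.
    S = np.zeros((n, n))
    for m in range(M):
        ids = np.array([index[(m, lab)] for lab in labels[m]])
        same = ids[:, None] == ids[None, :]
        S += np.where(same, 1.0, cluster_sim[ids[:, None], ids[None, :]])
    return S / M

# Three toy base clusterings of four samples.
base = np.array([[0, 0, 1, 1],
                 [0, 0, 0, 1],
                 [0, 1, 1, 1]])
S = wct_similarity(base)
# Hypothetical thresholding step: keep only strong similarities as weighted edges.
A = np.where(S >= 0.5, S, 0.0)
```

The resulting matrix `A` plays the role of the graph on which a graph autoencoder would then be trained. How the paper actually derives adjacency from the similarity matrix is not stated in the abstract, so the threshold is only a placeholder.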