Abstract
To address the shortcomings of traditional methods, which do not consider query criteria comprehensively and must rerun semi-supervised clustering repeatedly to update labels, an active density peak clustering algorithm based on weighted voting consensus is proposed. An active density peak clustering algorithm that accounts for both representativeness and informativeness is proposed: representative instances are selected to capture data patterns, and informative instances are queried to reduce the uncertainty of the clustering results. A fast update strategy is designed to update labels efficiently. An active clustering ensemble framework combining local and global uncertainty is proposed, and a weighted voting consensus method is introduced to optimize the integration of the clustering results. Experimental results on data sets show that the proposed method effectively addresses these problems of traditional methods and achieves good clustering performance.
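The abstract does not give the formulas used to select representative instances. As a rough illustration only, the sketch below uses the standard density peak quantities (local density rho and distance delta to the nearest denser point) and ranks instances by their product. The function names, the cutoff parameter `d_c`, and the ranking rule are assumptions made for this example, not the authors' stated method.

```python
import math

def density_peak_scores(points, d_c):
    """Return (rho, delta) per point using the standard density peak definitions.

    d_c is a hypothetical cutoff distance chosen by the caller.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Local density: number of other points strictly closer than d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]
    delta = []
    for i in range(n):
        # Distance to the nearest point with strictly higher density.
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        # A point with no denser neighbour gets its largest distance instead.
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta

def pick_representatives(points, d_c, k):
    """Return indices of the k points with the largest rho * delta (assumed rule)."""
    rho, delta = density_peak_scores(points, d_c)
    gamma = [r * d for r, d in zip(rho, delta)]
    return sorted(range(len(points)), key=lambda i: gamma[i], reverse=True)[:k]

# Two small, well-separated groups; each has one point near its centre.
pts = [(0, 0), (0, 1), (1, 0), (0.4, 0.4),
       (10, 10), (10, 11), (11, 10), (10.4, 10.4)]
representatives = pick_representatives(pts, d_c=1.0, k=2)
```

In an active clustering setting, indices such as those in `representatives` would be the candidates whose labels are queried first; how the paper combines this with informativeness is not described in the abstract.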
Authors
Zhang Suying (张苏颖)
Xu Shuqing (许曙青)
Affiliations: Nanjing Engineering Branch of Jiangsu Union Technical Institute, Nanjing 210035, Jiangsu, China; School of Computer Science, Nanjing University of Technology, Nanjing 210000, Jiangsu, China
Source
Computer Applications and Software (《计算机应用与软件》)
Peking University Core Journal (北大核心)
2023, Issue 10, pp. 297-306 (10 pages)
Funding
Research project funded under the fifth phase of the Jiangsu Province "333 Project", 2016 (BRA2016432).
Keywords
Semi-supervised
Active peak clustering
Weighted voting
Query selection