Semi-supervised density-based clustering method based on self-learning of metric parameters

Abstract: DBSCAN (Density-Based Spatial Clustering of Applications with Noise) does not reflect the differing contributions of individual dimensions to clustering, and its accuracy depends on manually set parameters such as the distance threshold. To address these issues, this paper proposes SMP-SDBSCAN, a semi-supervised DBSCAN based on self-learning of metric parameters. A distance-parameter training method based on the logistic regression model uses a small amount of labeled data to train the clustering contribution weight of each dimension. A cluster-parameter calculation mechanism then sets the average inter-cluster distance and neighborhood density of the labeled data clusters as the clustering parameters, improving the adaptability of the density clustering algorithm to the data set. Experiments show that the proposed method selects reasonable clustering parameters and effectively improves the clustering accuracy of the density-based clustering algorithm.
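The abstract gives no formulas, so the following is only a minimal sketch of how the two ideas could fit together, not the authors' implementation. It assumes that the logistic regression is fitted on labeled pairs (squared per-dimension differences as features, "different cluster" as the target) and that the fitted coefficients, clipped to be non-negative, serve as dimension weights. For the clustering parameters it substitutes the mean weighted distance between same-label points for the distance threshold and the mean same-label neighbor count for the density threshold; this is a stand-in for the paper's own definitions. The names `learn_weights`, `weighted_distance`, and `estimate_params` are hypothetical.

```python
import math
from itertools import combinations


def learn_weights(points, labels, lr=0.1, epochs=500):
    """Fit a logistic regression on labeled pairs (assumed formulation).

    Each pair yields squared per-dimension differences as features and
    target 1 if the two points carry different labels, else 0.
    Returns non-negative weights normalized to sum to 1.
    """
    dim = len(points[0])
    rows = []
    for i, j in combinations(range(len(points)), 2):
        feats = [(a - b) ** 2 for a, b in zip(points[i], points[j])]
        rows.append((feats, 1.0 if labels[i] != labels[j] else 0.0))

    w, b, n = [1.0] * dim, 0.0, len(rows)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for feats, y in rows:
            z = sum(wk * f for wk, f in zip(w, feats)) + b
            err = 1.0 / (1.0 + math.exp(-z)) - y  # sigmoid(z) - target
            for k in range(dim):
                grad_w[k] += err * feats[k]
            grad_b += err
        # Gradient step; clip at zero so weights stay valid for a distance.
        w = [max(0.0, wk - lr * g / n) for wk, g in zip(w, grad_w)]
        b -= lr * grad_b / n

    total = sum(w) or 1.0
    return [wk / total for wk in w]


def weighted_distance(x, y, w):
    """Weighted Euclidean distance using the learned dimension weights."""
    return math.sqrt(sum(wk * (a - b) ** 2 for wk, a, b in zip(w, x, y)))


def estimate_params(points, labels, w):
    """Derive (eps, min_pts) from the labeled clusters (assumed definitions)."""
    same = [
        weighted_distance(points[i], points[j], w)
        for i, j in combinations(range(len(points)), 2)
        if labels[i] == labels[j]
    ]
    eps = sum(same) / len(same)
    # Count same-label points within eps of each point (the point itself included).
    counts = [
        sum(
            1
            for j, q in enumerate(points)
            if labels[j] == labels[i] and weighted_distance(p, q, w) <= eps
        )
        for i, p in enumerate(points)
    ]
    min_pts = max(1, round(sum(counts) / len(counts)))
    return eps, min_pts


# Toy labeled sample: dimension 0 separates the clusters, dimension 1 is noise.
points = [(0.0, 0.3), (0.2, 2.9), (0.1, 1.5), (3.0, 0.2), (3.2, 3.0), (3.1, 1.4)]
labels = [0, 0, 0, 1, 1, 1]

weights = learn_weights(points, labels)
eps, min_pts = estimate_params(points, labels, weights)
print("weights:", weights, "eps:", eps, "min_pts:", min_pts)
```

In this toy sample the informative dimension should receive the larger weight. The resulting `eps` and `min_pts` could then be passed, together with `weighted_distance`, to any DBSCAN implementation that accepts a custom metric.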

Authors: YUAN Guo-quan; ZHAO Xin-jian; ZHANG Song; CHEN Shi; XU Chen-wei (Information & Telecommunication Branch, State Grid Jiangsu Electric Power Co., Ltd., Nanjing 210000, China)

Affiliation: Information & Telecommunication Branch, State Grid Jiangsu Electric Power Co., Ltd.

Source: Information Technology (《信息技术》), 2024, No. 11, pp. 77-83, 91 (8 pages in total)

Funding: Science and Technology Project of State Grid Jiangsu Electric Power Co., Ltd. (J2022109).

Keywords: density clustering; distance measurement; logistic regression; semi-supervised learning; self-learning

Classification code: TP311.1 [Automation and Computer Technology: Computer Software and Theory]
