Multi-scale Clustering Mining Method Based on Coupled Metric Similarity

Cited by: 10
Abstract: To better study multi-scale categorical data sets that are not independent and identically distributed (non-IID), a multi-scale clustering mining algorithm for non-IID categorical-attribute data sets is proposed, based on the unsupervised coupled metric similarity method. First, the benchmark-scale data set is clustered using coupled metric similarity. Second, two scale conversion algorithms are proposed: upscaling based on single chain (single linkage) and downscaling based on the Lanczos kernel. Finally, experiments are performed on public data sets and on real data sets from H province. Coupled metric similarity (CMS), inverse occurrence frequency (IOF), Hamming distance (HM) and other similarity measures, each combined with spectral clustering, serve as the comparison algorithms. Compared with these, the upscaling algorithm increases NMI by 13.1% on average, reduces MSE by 0.827 on average, and increases F-score by 12.8% on average; the downscaling algorithm increases NMI by 19.2% on average, reduces MSE by 0.028 on average, and increases F-score by 15.5% on average. The experimental results and theoretical analysis show that the proposed algorithm is effective and feasible.
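To make the comparison baselines named in the abstract concrete, the following is a minimal sketch of two of them, Hamming-based similarity and inverse occurrence frequency (IOF) similarity, computed on a toy categorical data set. It does not reproduce the paper's CMS measure or its scale conversion steps. The function names, the toy records, and the choice to average per-attribute scores are assumptions made for illustration; the IOF rule used is the commonly cited form, in which a mismatch on attribute k scores 1 / (1 + log f(x) * log f(y)), with f the value's frequency in that attribute.

```python
import math
from collections import Counter


def value_frequencies(records):
    """Count how often each value occurs in each attribute (column)."""
    return [Counter(column) for column in zip(*records)]


def hamming_similarity(x, y):
    """Fraction of attributes on which two records agree,
    i.e. 1 minus the normalized Hamming distance."""
    return sum(a == b for a, b in zip(x, y)) / len(x)


def iof_similarity(x, y, freqs):
    """Inverse occurrence frequency similarity, averaged over attributes.

    A match scores 1; a mismatch scores 1 / (1 + log f(a) * log f(b)),
    so mismatches between frequent values are penalized more heavily.
    """
    total = 0.0
    for k, (a, b) in enumerate(zip(x, y)):
        if a == b:
            total += 1.0
        else:
            total += 1.0 / (1.0 + math.log(freqs[k][a]) * math.log(freqs[k][b]))
    return total / len(x)


def similarity_matrix(records, sim):
    """Pairwise similarity matrix as a list of lists."""
    n = len(records)
    return [[sim(records[i], records[j]) for j in range(n)] for i in range(n)]


# Hypothetical toy data: two categorical attributes per record.
records = [
    ("red", "small"),
    ("red", "large"),
    ("blue", "large"),
    ("blue", "large"),
]
freqs = value_frequencies(records)
hamming_matrix = similarity_matrix(records, hamming_similarity)
iof_matrix = similarity_matrix(records, lambda x, y: iof_similarity(x, y, freqs))
```

Either matrix could then serve as a precomputed affinity for a spectral clustering routine (for example scikit-learn's `SpectralClustering` with `affinity="precomputed"`), which is one plausible reading of how the abstract's baselines pair a similarity measure with spectral clustering.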
Authors: TIAN Zhenzhen (田真真); ZHAO Shuliang (赵书良); LI Wenbin (李文斌); ZHANG Lulu (张璐璐); CHEN Runzi (陈润资) (College of Computer and Cyber Security, Hebei Normal University, Shijiazhuang, 050024, China; Hebei Provincial Engineering Research Center for Supply Chain Big Data Analytics & Data Security, Hebei Normal University, Shijiazhuang, 050024, China; Key Laboratory of Network & Information Security, Hebei Normal University, Shijiazhuang, 050024, China; College of Information Engineering, Hebei GEO University, Shijiazhuang, 050031, China; School of Mathematical Sciences, Hebei Normal University, Shijiazhuang, 050024, China)
Source: Journal of Data Acquisition and Processing (《数据采集与处理》), indexed in CSCD and the Peking University Core Journals list, 2020, No. 3, pp. 549-562 (14 pages)
Funding: Supported by the Major Program of the National Social Science Foundation of China (13&ZD091, 18ZdA200).
Keywords: multi-scale; clustering; categorical data; scale conversion; metric learning; coupled metric similarity