Abstract
This paper examines the differences and connections between the correlation coefficient and the distance correlation coefficient, and uses the distance correlation coefficient to measure the similarity between samples and, in turn, the structure of sample classes. The Euclidean distance between samples is converted into a similarity, two adaptive conversion methods are given, and the effects of different conversion methods on spectral clustering are compared. Spectral clustering is then performed on these similarities, and numerical experiments are carried out on three typical data sets. The numerical results show the following. Distance correlation coefficient spectral clustering is simple and effective; it measures the similarity between samples from a new perspective and can group samples whose feature values are strongly correlated into one class, with the feature values on a consistent order of magnitude. Reciprocal-distance spectral clustering does not depend on parameters and is simple and effective. Global spectral clustering and adaptive spectral clustering depend on unknown parameters and are strongly affected by them, yet there is no agreed way of choosing these parameters, so the choice is subjective. By comparison, stable adaptive spectral clustering adapts better to the range of parameter values.
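The two parameter-free similarities mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard sample distance correlation (Székely's definition) applied to a pair of samples by treating each sample's feature vector as paired observations, and it assumes the reciprocal-distance similarity is simply 1/d for distinct samples. Neither detail is stated in the abstract, and the function names and the `eps` guard are hypothetical, so this is not necessarily the authors' exact construction.

```python
import math


def _centered_distances(v):
    """Double-centered matrix of pairwise absolute differences of v."""
    n = len(v)
    d = [[abs(v[i] - v[j]) for j in range(n)] for i in range(n)]
    row = [sum(r) / n for r in d]          # row means (= column means, d is symmetric)
    grand = sum(row) / n                   # grand mean
    return [[d[i][j] - row[i] - row[j] + grand for j in range(n)]
            for i in range(n)]


def distance_correlation(x, y):
    """Sample distance correlation of two equal-length numeric sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    A, B = _centered_distances(x), _centered_distances(y)
    pairs = [(i, j) for i in range(n) for j in range(n)]
    dcov2 = sum(A[i][j] * B[i][j] for i, j in pairs) / n ** 2
    dvar_x = sum(A[i][j] ** 2 for i, j in pairs) / n ** 2
    dvar_y = sum(B[i][j] ** 2 for i, j in pairs) / n ** 2
    if dvar_x == 0.0 or dvar_y == 0.0:
        return 0.0                         # a constant vector carries no dependence
    return math.sqrt(max(dcov2, 0.0) / math.sqrt(dvar_x * dvar_y))


def dcor_similarity(samples):
    """Assumed construction: similarity of samples i, j is the distance
    correlation of their feature vectors."""
    m = len(samples)
    return [[distance_correlation(samples[i], samples[j]) for j in range(m)]
            for i in range(m)]


def reciprocal_distance_similarity(samples, eps=1e-12):
    """Assumed construction: w_ij = 1 / d_ij for i != j, zero diagonal.
    eps (hypothetical) guards against duplicate samples with d_ij = 0."""
    m = len(samples)
    W = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1, m):
            d = math.dist(samples[i], samples[j])
            W[i][j] = W[j][i] = 1.0 / max(d, eps)
    return W
```

Either matrix could then be passed as a precomputed affinity to a spectral clustering routine. Note that the distance correlation of `x` and `x**2` on a symmetric grid is positive even though their Pearson correlation is zero, which is the kind of nonlinear dependence the abstract's "new perspective" refers to.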
Authors
Wang Bingcan (王丙参)
Wei Yanhua (魏艳华)
Li Xu (李旭)
School of Mathematics and Statistics, Tianshui Normal University, Tianshui, Gansu 741001, China; School of Statistics, Capital University of Economics and Business, Beijing 100070, China
Source
Statistics & Decision (《统计与决策》)
CSSCI
Peking University Core Journal (北大核心)
2022, Issue 15, pp. 22-28 (7 pages)
Funding
National Natural Science Foundation of China (Grant Nos. 11665019, 11671268).
Keywords
distance correlation coefficient
similarity
spectral clustering
self-adaptation
evaluation index