Abstract
The affinity propagation (AP) algorithm does not require the number of clusters to be specified in advance; it identifies the cluster centers and the number of clusters automatically while it runs. On a given data set, AP produces stable, robust clustering results, and it can work with a variety of distance measures to give accurate results. However, the existing AP algorithm cannot cluster heterogeneous data. To address this problem, the paper proposes a high-order affinity propagation algorithm based on tensor distance (HTDAP). The algorithm first represents each heterogeneous data object as a tensor and then introduces the tensor distance into AP to measure the similarity between heterogeneous data objects in tensor space. The tensor distance measures not only the numerical differences between objects but also the differences between their coordinate positions in the high-order space, so it effectively captures the distribution features of heterogeneous data objects. Experimental results show that the proposed high-order AP algorithm clusters heterogeneous data objects effectively.
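The abstract does not state the tensor distance formula, so the following is only a minimal sketch under one formulation found in the tensor-distance literature: the squared distance is a sum of g(l, m) * (x_l - y_l) * (x_m - y_m) over all pairs of elements, where the weight g(l, m) is a Gaussian of the distance between the coordinate positions of elements l and m. Whether the paper uses exactly this form is an assumption. The function names `metric_matrix`, `tensor_distance`, and `similarity_matrix`, the parameter `sigma`, and the choice of the negative squared distance as the AP similarity are all illustrative.

```python
import itertools
import math


def metric_matrix(shape, sigma=1.0):
    """Gaussian weights g[l][m] from the distance between element positions l and m.

    Assumed formulation, not taken from the abstract.
    """
    coords = list(itertools.product(*(range(n) for n in shape)))
    norm = 1.0 / (2.0 * math.pi * sigma ** 2)
    return [
        [norm * math.exp(-sum((a - b) ** 2 for a, b in zip(p, q)) / (2.0 * sigma ** 2))
         for q in coords]
        for p in coords
    ]


def tensor_distance(x, y, g):
    """Tensor distance between two tensors flattened in row-major order."""
    if not (len(x) == len(y) == len(g)):
        raise ValueError("x, y and the metric matrix must have matching sizes")
    diff = [a - b for a, b in zip(x, y)]
    total = sum(g[l][m] * diff[l] * diff[m]
                for l in range(len(diff)) for m in range(len(diff)))
    return math.sqrt(max(total, 0.0))  # guard against tiny negative rounding error


def similarity_matrix(objects, shape, sigma=1.0):
    """AP-style similarities s(i, k) = -d(i, k)^2 for a list of flattened tensors."""
    g = metric_matrix(shape, sigma)
    return [[-tensor_distance(a, b, g) ** 2 for b in objects] for a in objects]


# Both objects below differ from the reference in two elements by 1.0, so their
# Euclidean distance to the reference is the same. Under the assumed weights,
# differences at neighbouring positions reinforce each other more strongly than
# differences at distant positions, so the two tensor distances are not equal.
reference = [0.0, 0.0, 0.0, 0.0]
adjacent = [1.0, 1.0, 0.0, 0.0]
far_apart = [1.0, 0.0, 0.0, 1.0]
g = metric_matrix((4,))
print(tensor_distance(reference, adjacent, g))
print(tensor_distance(reference, far_apart, g))
```

A matrix built by `similarity_matrix` could then be passed to any AP implementation that accepts precomputed similarities, for example scikit-learn's `AffinityPropagation(affinity="precomputed")`; that pairing is a suggestion for experimentation, not the paper's implementation.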
Source
《沈阳师范大学学报(自然科学版)》
CAS
2016, No. 1, pp. 96-99 (4 pages)
Journal of Shenyang Normal University: Natural Science Edition
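Funding
Liaoning Provincial Department of Science and Technology research project on a prediction system for undergraduate program offerings in higher education institutions (Liao Jiao Han [1008] No. 225)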
Keywords
clustering
heterogeneous data
tensor distance
affinity propagation (AP) algorithm