摘要
为了解决样本数较大时,传统谱聚类算法执行特征分解消耗时间过大的问题,提出了一种无需特征分解的快速谱聚类算法,通过乘法更新迭代来降低时间开销。首先,利用Nystr?m方法进行随机采样,建立了采样矩阵和原始矩阵之间的关系;其次,基于乘法更新原理实现矩阵指示器矩阵的迭代更新;最后,在理论上对所设计算法进行了正确性和收敛性分析。在广泛使用的五个真实数据集和三个人工合成数据集上进行测试。实验结果表明,在真实数据集上,所提算法的标准互信息(NMI)平均值为0.45,与k-means聚类算法相比提高了12.50%;运行时间为61.73 s,与传统谱聚类算法相比减少了61.13%;而且表现性能优于层次聚类算法,验证了该算法的有效性。
The traditional spectral clustering algorithm needs too much time to perform eigen-decomposition when the number of samples is very large.In order to solve the problem,a fast spectral clustering algorithm without eigendecomposition was proposed to reduce the time overhead by multiplication update iteration.Firstly,the Nystr?m algorithm was used for random sampling in order to establish the relationship between the sampling matrix and the original matrix.Then,the indicator matrix was updated iteratively based on the principle of multiplication update iteration.Finally,the correctness and convergence analysis of the designed algorithm were given theoretically.The proposed algorithm was tested on five widely used real datasets and three synthetic datasets.Experimental results on real datasets show that:the average Normalized Mutual Information(NMI)of the proposed algorithm is 0.45,which is improved by 12.5%compared with that of the k-means clustering algorithm;the computing time of the proposed algorithm achieves 61.73 s,which is decreased by61.13%compared with that of the traditional spectral clustering algorithm;and the performance of the proposed algorithm is superior to that of the hierarchical clustering algorithm,which verify the effectiveness of the proposed algorithm.
作者
刘静姝
王莉
刘惊雷
LIU Jingshu;WANG Li;LIU Jinglei(College of Data Science,Taiyuan University of Technology,Jinzhong Shanxi 030600,China;School of Computer and Control Engineering,Yantai University,Yantai Shandong 264005,China)
出处
《计算机应用》
CSCD
北大核心
2020年第12期3413-3422,共10页
journal of Computer Applications
基金
国家自然科学基金资助项目(61872260)
山西省自然科学基金资助项目(201703D421013)。
关键词
谱聚类
Nystrom采样
收敛性分析
特征分解
乘法更新迭代
spectral clustering
Nystrom sampling
convergence analysis
eigen-decomposition
multiplication update iteration