Abstract
The kernel-based multi-view clustering method (MVKKM) has a long running time on large-scale datasets. To address this, the incremental clustering model is combined with MVKKM to give an incremental multi-view clustering algorithm based on kernel K-means (IMVCKM). The dataset is divided into chunks, each chunk is clustered with MVKKM, and the cluster centers obtained from one chunk serve as the initial cluster centers for the next chunk. The cluster centers of all chunks are then combined and clustered once more with MVKKM to obtain the final clustering result. Experiments on three large-scale datasets show that IMVCKM achieves better clustering results than MVKKM on three evaluation metrics, with a shorter running time. The proposed algorithm thus substantially reduces running time while maintaining clustering performance.
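The chunk-wise scheme described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. Several details are assumptions that the abstract does not specify: a Gaussian kernel with a shared `gamma`, a simplified weighted multi-view kernel k-means in which view weights are proportional to `dispersion ** (-1 / (p - 1))`, each cluster center being represented by its nearest sample so that it can be carried into the next chunk, and the final labeling of all samples by their nearest combined center. All function and parameter names are hypothetical.

```python
import numpy as np

def rbf(A, B, gamma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def kernel_dists(K, labels, k):
    """Squared feature-space distance from every sample to every cluster mean."""
    D = np.full((K.shape[0], k), np.inf)  # empty clusters stay at +inf
    for c in range(k):
        idx = np.flatnonzero(labels == c)
        if idx.size:
            D[:, c] = np.diag(K) - 2.0 * K[:, idx].mean(1) + K[np.ix_(idx, idx)].mean()
    return D

def mvkkm(views, k, init_idx, gamma=1.0, p=2.0, n_iter=30):
    """Simplified weighted multi-view kernel k-means (assumes p > 1).
    Returns labels, one representative sample index per non-empty cluster,
    and the learned view weights."""
    Ks = [rbf(X, X, gamma) for X in views]
    w = np.full(len(Ks), 1.0 / len(Ks))
    combine = lambda w: sum(wv**p * Kv for wv, Kv in zip(w, Ks))
    K = combine(w)
    diag = np.diag(K)
    # Initial assignment: nearest initial center in the combined feature space.
    labels = (diag[:, None] - 2.0 * K[:, init_idx] + diag[init_idx][None, :]).argmin(1)
    rows = np.arange(labels.size)
    for _ in range(n_iter):
        new = kernel_dists(combine(w), labels, k).argmin(1)
        # Per-view within-cluster dispersion drives the view weights.
        disp = np.array([kernel_dists(Kv, new, k)[rows, new].sum() for Kv in Ks])
        w = np.maximum(disp, 1e-12) ** (-1.0 / (p - 1.0))
        w /= w.sum()
        converged = np.array_equal(new, labels)
        labels = new
        if converged:
            break
    # Assumption: represent each implicit center by its closest member sample.
    D = kernel_dists(combine(w), labels, k)
    reps = []
    for c in range(k):
        members = np.flatnonzero(labels == c)
        if members.size:
            reps.append(int(members[D[members, c].argmin()]))
    return labels, reps, w

def incremental_mvkkm(views, k, chunk_size, gamma=1.0, p=2.0, seed=0):
    """Cluster chunk by chunk, then cluster the pooled chunk centers.
    Assumes the first chunk holds at least k samples."""
    rng = np.random.default_rng(seed)
    n = views[0].shape[0]
    carried = [np.empty((0, X.shape[1])) for X in views]  # centers of previous chunk
    pool = [np.empty((0, X.shape[1])) for X in views]     # centers of all chunks
    for start in range(0, n, chunk_size):
        chunk = [X[start:start + chunk_size] for X in views]
        m = carried[0].shape[0]
        data = [np.vstack([c, x]) for c, x in zip(carried, chunk)]
        # The previous chunk's centers seed this chunk; the first chunk is seeded at random.
        init = np.arange(m) if m else rng.choice(data[0].shape[0], size=k, replace=False)
        _, reps, _ = mvkkm(data, k, init, gamma, p)
        carried = [d[reps] for d in data]
        pool = [np.vstack([P, c]) for P, c in zip(pool, carried)]
    # Final stage: multi-view clustering of the pooled centers.
    init = rng.choice(pool[0].shape[0], size=min(k, pool[0].shape[0]), replace=False)
    _, reps, w = mvkkm(pool, k, init, gamma, p)
    # Label every sample by its nearest final center (largest weighted kernel similarity).
    sim = sum(wv**p * rbf(X, P[reps], gamma) for wv, X, P in zip(w, views, pool))
    return sim.argmax(1), w
```

A call such as `incremental_mvkkm([view1, view2], k=3, chunk_size=100)` takes a list of per-view feature matrices with matching rows and returns one label per sample together with the view weights from the final stage. Each chunk only needs a kernel matrix of roughly `chunk_size` by `chunk_size`, which is where the savings over clustering the full dataset at once would come from.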
Authors
ZHANG Peirui (张佩瑞)
YANG Yan (杨燕)
XING Huanlai (邢焕来)
YU Xiuying (喻琇瑛)
School of Information Science and Technology, Southwest Jiaotong University, Chengdu 611756, Sichuan, China
Source
Journal of Shandong University (Engineering Science) (《山东大学学报(工学版)》)
CAS
PKU Core (北大核心)
2018, No. 3, pp. 48-53 (6 pages)
Funding
National Natural Science Foundation of China (61572407)
National Key Technology R&D Program of China (2015BAH19F02)
Keywords
multi-view clustering (多视图聚类)
kernel function (核函数)
multi-view kernel K-means (多视图核K-means)
incremental clustering (增量聚类)
dataset chunk (数据块)
cluster center (聚类中心)