Abstract
Clustering is an important data mining technique, and among clustering problems K-means is probably the best known. A key question is how to obtain an approximate K-means solution with a theoretical guarantee more quickly on massive data. This paper surveys progress on that question. First, it defines the K-means problem and introduces the relevant background. Second, it reviews state-of-the-art research from China and abroad along two lines: theoretical guarantees and acceleration. Finally, it summarizes the main results and discusses future directions for K-means research on big data.
Author
任远航
Ren Yuanhang (School of Information & Software Engineering, University of Electronic Science & Technology of China, Chengdu 610054, China)
Source
《计算机应用研究》
CSCD
PKU Core Journal (北大核心)
2020, No. 12, pp. 3528-3533 (6 pages)
Application Research of Computers