Abstract
The k-means algorithm is one of the most widely used clustering methods in data mining, but its results are sensitive to the choice of initial cluster centers and to the number of clusters k. This paper surveys representative improved k-means algorithms proposed in recent years for selecting the initial cluster centers and determining k, analyzing them in terms of algorithmic principle, key techniques, and strengths and weaknesses. Several typical algorithms are then tested and applied on well-known data sets. This work is intended as a useful reference for data mining research.
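The sensitivity to initial cluster centers mentioned in the abstract can be seen on a very small example. The following is a minimal sketch of the standard (Lloyd-style) k-means iteration, not any of the improved algorithms surveyed in the paper. The function name `kmeans`, the four toy points, and the two starting configurations are assumptions chosen purely for illustration.

```python
def kmeans(points, centers, max_iter=100):
    """Standard Lloyd iteration started from the given centers.

    Returns the final centers and the sum of squared errors (SSE).
    """
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            clusters[dists.index(min(dists))].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        new_centers = [
            tuple(sum(coord) / len(cl) for coord in zip(*cl)) if cl else c
            for cl, c in zip(clusters, centers)
        ]
        if new_centers == centers:  # converged
            break
        centers = new_centers
    sse = sum(
        min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers)
        for p in points
    )
    return centers, sse


# Toy data (hypothetical): two tight pairs of points, far apart horizontally.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

# Start A: one center in each pair. Start B: both centers midway between pairs.
centers_a, sse_a = kmeans(points, [(0.0, 0.0), (4.0, 0.0)])
centers_b, sse_b = kmeans(points, [(2.0, 0.0), (2.0, 1.0)])

print(sse_a, sse_b)  # 1.0 16.0
```

With the same data and the same k = 2, start A converges to the left/right split (SSE 1.0), while start B is already a fixed point of the iteration and stays at the much worse top/bottom split (SSE 16.0). This is the kind of dependence on the starting centers that motivates improved initialization schemes.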
Source
《长春理工大学学报(自然科学版)》
2010, No. 1, pp. 164-166, 163 (4 pages in total)
Journal of Changchun University of Science and Technology (Natural Science Edition)
Funding
Natural Science Foundation of Heilongjiang Province (F200603)