Abstract
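The original K-means algorithm selects its initial cluster centers at random, and this choice strongly affects the clustering result. An improved algorithm is proposed to address this shortcoming. It applies a rule based on the distances between cluster centers selected by sampling, and repeats the selection several times before fixing the final initial cluster centers, so that the influence of the initial centers on the result is minimized. In addition, once the initial cluster centers have been selected, the initial values are standardized. The improved K-means algorithm was applied to sales industry data, and the results show that it is more efficient than the original algorithm.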
The K-means clustering algorithm has several deficiencies. For instance, its result is affected by the choice of initial cluster centers. This paper improves the selection of initial cluster centers in the K-means algorithm. By selecting cluster centers through sampling, the improved algorithm fixes the initial cluster centers only after multiple rounds of selection, thus minimizing their effect on the result. In addition, the initial data are standardized once the initial cluster centers are selected. On this basis, the improved algorithm was applied to sales industry data to identify the distinct characteristics of different sales regions. Enterprises can then offer differentiated services or products according to the characteristics of each region.
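The abstract does not give the exact selection rule, so the following is only a minimal sketch of one plausible reading, not the paper's method. It assumes that each round samples k candidate centers at random and that the candidate set with the largest minimum pairwise distance is kept; it then applies z-score standardization to the data and the chosen centers. The function names, the `trials` and `seed` parameters, and the spread criterion are all hypothetical.

```python
import math
import random


def min_pairwise_distance(points):
    """Smallest Euclidean distance between any two points (needs >= 2 points)."""
    return min(
        math.dist(p, q)
        for i, p in enumerate(points)
        for q in points[i + 1:]
    )


def choose_initial_centers(data, k, trials=10, seed=0):
    """Sample k candidate centers `trials` times; keep the most spread-out set.

    Assumption: "most spread-out" means the largest minimum pairwise
    distance. The abstract does not specify the actual criterion.
    """
    rng = random.Random(seed)
    best, best_score = None, -1.0
    for _ in range(trials):
        candidate = rng.sample(data, k)
        score = min_pairwise_distance(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best


def standardize(data, centers):
    """Z-score every feature of data and centers with the data's mean and std."""
    dims = range(len(data[0]))
    n = len(data)
    means = [sum(row[d] for row in data) / n for d in dims]
    # A constant feature has std 0; fall back to 1.0 to avoid division by zero.
    stds = [
        math.sqrt(sum((row[d] - means[d]) ** 2 for row in data) / n) or 1.0
        for d in dims
    ]

    def scale(row):
        return tuple((row[d] - means[d]) / stds[d] for d in dims)

    return [scale(r) for r in data], [scale(c) for c in centers]


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
    centers = choose_initial_centers(points, k=2, trials=20)
    scaled_points, scaled_centers = standardize(points, centers)
    print(scaled_centers)
```

The standardized points and centers could then be passed to an ordinary K-means iteration. Standardizing after selection follows the order stated in the abstract; using the same mean and standard deviation for both keeps the centers consistent with the data.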
Source
《太原理工大学学报》
CAS
PKU Core Journal
2009, No. 3, pp. 236-239 (4 pages)
Journal of Taiyuan University of Technology