Abstract
This paper studies the application of cluster analysis to network recommendation systems. Because the k-means clustering algorithm is susceptible to local optima and noise points, the paper combines DBSCAN (Density-Based Spatial Clustering of Applications with Noise) with the MMD (Max-Min Distance) method for selecting initial cluster centers to improve the original k-means algorithm, and proposes the DMK (Density-based and Max-min-distance K-means) algorithm. DMK uses DBSCAN to identify high-density points as the candidate set for the first cluster center, then selects the K-1 points that are farthest apart as the remaining K-1 cluster centers, and finally runs k-means from this set of initial centers. Simulation and experimental results show that the initial centers chosen by DMK are well dispersed and representative, that the number of clustering iterations decreases, and that the purity of the clustering results improves.
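The three stages described in the abstract (density-based candidate selection, max-min-distance seeding, then k-means) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The function names, the `eps` and `min_pts` parameters, and the toy data are hypothetical. Two choices are assumptions the abstract does not settle: the first center is taken to be the candidate with the most neighbors, and the remaining K-1 centers are drawn from the same high-density candidates so that isolated noise points cannot become centers. Only the core-point test of DBSCAN is used here, not the full clustering.

```python
import math

def core_points(points, eps, min_pts):
    """Return (index, neighbor_count) for every point that has at least
    min_pts points (itself included) within distance eps. This is the
    DBSCAN core-point test only, not a full DBSCAN clustering."""
    cores = []
    for i, p in enumerate(points):
        count = sum(1 for q in points if math.dist(p, q) <= eps)
        if count >= min_pts:
            cores.append((i, count))
    return cores

def dmk_init(points, k, eps, min_pts):
    """Pick k initial centers: the densest core point first (assumption),
    then repeatedly the core point whose minimum distance to the centers
    chosen so far is largest (max-min distance rule)."""
    cores = core_points(points, eps, min_pts)
    if len(cores) < k:
        raise ValueError("fewer than k high-density points; relax eps or min_pts")
    candidates = [i for i, _ in cores]
    chosen = [max(cores, key=lambda c: c[1])[0]]
    while len(chosen) < k:
        nxt = max(
            (i for i in candidates if i not in chosen),
            key=lambda i: min(math.dist(points[i], points[c]) for c in chosen),
        )
        chosen.append(nxt)
    return [points[i] for i in chosen]

def kmeans(points, centers, max_iter=100):
    """Standard Lloyd iterations starting from the given centers.
    Returns (centers, labels, iterations_used)."""
    centers = [tuple(c) for c in centers]
    labels = []
    iterations = 0
    for iterations in range(1, max_iter + 1):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels, iterations

# Toy data (hypothetical): three tight groups plus one isolated noise point.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 10), (10, 11), (11, 10), (11, 11),
    (0, 10), (0, 11), (1, 10), (1, 11),
    (20, 20),
]
init_centers = dmk_init(points, k=3, eps=1.5, min_pts=3)
final_centers, labels, iterations = kmeans(points, init_centers)
print(init_centers, labels, iterations)
```

In this toy run the isolated point (20, 20) fails the density test, so it is never offered as a center, whereas a plain max-min rule over all points would pick it as the point farthest from the first center. Restricting the later centers to high-density points is the sketch's own design choice and may differ from the paper.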
Source
Information & Communications (《信息通信》)
2017, No. 7, pp. 34-36 (3 pages)
Keywords
clustering, k-means, DMK (Density-based and Max-min-distance K-means)