Abstract
To overcome the memory-capacity and CPU-speed bottlenecks that the traditional K-Medoids clustering algorithm faces when processing massive data sets, this paper proposes a parallel K-Medoids algorithm based on the MapReduce programming model. Under the MapReduce model, the algorithm increases the computation granularity and reduces the proportion of communication cost. Experimental results show that, compared with other algorithms, the improved parallel algorithm achieves greatly improved speedup and operating efficiency.
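As an illustration of how K-Medoids can be cast in map and reduce steps, the following is a minimal single-process sketch, not the implementation evaluated in the paper. It assumes that the map step assigns each point in a data split to its nearest medoid and that the reduce step selects, for each cluster, the member with the smallest total distance to the others. The names `map_assign`, `reduce_update`, and `kmedoids_mapreduce` are hypothetical, as is the choice of Euclidean distance.

```python
from collections import defaultdict


def distance(a, b):
    """Euclidean distance between two points (an assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def map_assign(split, medoids):
    """Map step: emit (index of nearest medoid, point) for every point in one split."""
    pairs = []
    for p in split:
        nearest = min(range(len(medoids)), key=lambda i: distance(p, medoids[i]))
        pairs.append((nearest, p))
    return pairs


def reduce_update(members):
    """Reduce step: choose the member with the smallest total distance to the rest."""
    return min(members, key=lambda c: sum(distance(c, p) for p in members))


def kmedoids_mapreduce(splits, medoids, max_iter=20):
    """Repeat map, shuffle, and reduce until the medoids stop changing."""
    medoids = list(medoids)
    for _ in range(max_iter):
        groups = defaultdict(list)
        # Each split stands in for one map task; a whole split per task
        # keeps the unit of work coarse.
        for split in splits:
            for key, p in map_assign(split, medoids):
                groups[key].append(p)  # shuffle: group points by medoid index
        # An empty cluster keeps its previous medoid.
        new_medoids = [
            reduce_update(groups[i]) if groups[i] else medoids[i]
            for i in range(len(medoids))
        ]
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids


if __name__ == "__main__":
    splits = [
        [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)],
        [(10.0, 10.0), (10.0, 11.0), (11.0, 10.0)],
    ]
    print(kmedoids_mapreduce(splits, [(0.0, 1.0), (11.0, 10.0)]))
```

In this sketch only the medoid list is shared between iterations, so the data exchanged per round is small relative to the distance computations done inside each split. On a real cluster the splits would be processed by separate map tasks rather than by a loop.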