Abstract
The K-means clustering algorithm, as implemented under the traditional programming model, is a typical serial algorithm and performs poorly when clustering big data. To achieve satisfactory big-data clustering performance and to address the inherent shortcomings of K-means, a parallel bee colony K-means clustering algorithm based on MPI is proposed. An improved artificial bee colony algorithm is combined with K-means iteration, which improves the global optimization ability of the algorithm and reduces the influence of the initial cluster centers on clustering quality. The combined algorithm is then parallelized with MPI. Simulation experiments were run on both the serial and the parallel bee colony K-means clustering algorithms, and the results show that the parallel algorithm is superior in both efficiency and performance.
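As a rough illustration of the data-parallel idea in the abstract, the sketch below splits the points among several "ranks", lets each rank compute per-cluster partial sums, and then combines those partials to update the centers. It is not the authors' implementation: the ranks are simulated in-process as list slices instead of MPI processes, the bee colony search for initial centers is omitted, and all function names (`split_among_ranks`, `local_partial_sums`, `reduce_partials`, `kmeans_step`, `kmeans`) and the sample data are hypothetical. In a real MPI program the combining step would typically be a collective reduction such as `MPI_Allreduce`.

```python
# Sketch only: each "rank" is simulated as a list slice; no real MPI is used.
# The bee colony initialization described in the abstract is not included.

def split_among_ranks(points, n_ranks):
    """Deal points out round-robin, standing in for an MPI scatter."""
    return [points[r::n_ranks] for r in range(n_ranks)]

def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def local_partial_sums(chunk, centers):
    """Work of one rank: assign its points, accumulate per-cluster sums and counts."""
    dim = len(centers[0])
    sums = [[0.0] * dim for _ in centers]
    counts = [0] * len(centers)
    for p in chunk:
        k = min(range(len(centers)), key=lambda j: squared_distance(p, centers[j]))
        counts[k] += 1
        for d in range(dim):
            sums[k][d] += p[d]
    return sums, counts

def reduce_partials(partials, n_clusters, dim):
    """Element-wise sum over ranks, standing in for an MPI all-reduce."""
    total_sums = [[0.0] * dim for _ in range(n_clusters)]
    total_counts = [0] * n_clusters
    for sums, counts in partials:
        for k in range(n_clusters):
            total_counts[k] += counts[k]
            for d in range(dim):
                total_sums[k][d] += sums[k][d]
    return total_sums, total_counts

def kmeans_step(points, centers, n_ranks):
    """One K-means iteration: local partial sums, reduction, center update."""
    chunks = split_among_ranks(points, n_ranks)
    partials = [local_partial_sums(c, centers) for c in chunks]
    sums, counts = reduce_partials(partials, len(centers), len(centers[0]))
    # Keep the old center if a cluster received no points.
    return [
        [s / counts[k] for s in sums[k]] if counts[k] else list(centers[k])
        for k in range(len(centers))
    ]

def kmeans(points, centers, n_ranks=4, max_iter=100, tol=1e-9):
    """Iterate until the largest squared center shift is at most tol."""
    for _ in range(max_iter):
        new_centers = kmeans_step(points, centers, n_ranks)
        shift = max(squared_distance(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:
            break
    return centers

# Hypothetical sample data: two well-separated groups of four points each.
points = [(0.0, 0.0), (0.0, 2.0), (2.0, 0.0), (2.0, 2.0),
          (10.0, 10.0), (10.0, 12.0), (12.0, 10.0), (12.0, 12.0)]
initial_centers = [[0.0, 0.0], [12.0, 12.0]]
final_centers = kmeans(points, initial_centers, n_ranks=3)
print(final_centers)
```

Because each rank only returns per-cluster sums and counts, the amount of data exchanged per iteration depends on the number of clusters and dimensions, not on the number of points, which is one reason this decomposition is a common choice for parallel K-means.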
Source
Computer Engineering and Design (《计算机工程与设计》)
Peking University Core Journal (北大核心)
2017, Issue 12, pp. 3339-3343 (5 pages)
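Funding
National Social Science Fund of China (15XTQ010)
Guangxi Higher Education Institutions Science and Technology Research Fund (KY2015YB351)
Guangxi Economic Management Cadre College Research Start-up Fund
National Natural Science Foundation of China (61364020)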