Abstract
To improve the time efficiency of parallel genetic algorithms on big-data clustering problems, this paper proposes a parallel design of the coarse-grained genetic algorithm built on the MapReduce computing framework on the Hadoop platform, drawing on the parallelization idea behind coarse-grained genetic algorithms. The motivation is the sheer volume of big data, which makes clustering algorithms extremely time-consuming; parallelism is a fairly effective way to compensate for insufficient computing power. Experimental results show that, compared with traditional serial processing, the parallelized genetic algorithm markedly reduces the time consumed when clustering big data.
Authors
GUO Chen-Chen (郭晨晨)
ZHU Hong-Kang (朱红康)
School of Mathematics and Computer Science, Shanxi Normal University, Linfen 041000, China
Source
Journal of Engineering of Heilongjiang University (《黑龙江大学工程学报》)
2016, No. 3, pp. 87-91 (5 pages)
Funding
Supported by the Natural Science Foundation of Shanxi Province (2015011040)