Abstract
To address depth-first search (DFS) on graphs in multi-core CPU and GPU environments, this paper proposes a new algorithm for parallel DFS on multi-core CPUs. The algorithm improves performance by using memory bandwidth effectively, and its advantage grows as the graph gets larger. Building on it, the paper proposes a hybrid method that dynamically selects the best implementation for each branch of the DFS: sequential execution, multi-core execution with either of two different algorithms, or GPU execution. The hybrid method delivers relatively better performance for graphs of every size and avoids the worst case on high-diameter graphs. The paper also compares multi-CPU and GPU systems to analyze how the underlying architecture affects DFS performance. Experimental results show that a high-end single-socket GPU system matches the DFS performance of a high-end 4-socket CPU system.
Source
Application Research of Computers (《计算机应用研究》)
CSCD
PKU Core Journals (北大核心)
2014, No. 10, pp. 2982-2985 (4 pages)
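Funding
National Natural Science Foundation of China (61370095, 61370098, 61070057, 90715029)
Scientific Research Project of the Hunan Provincial Department of Education (13C074)
Science and Technology Development Plan Project of the Hengyang Science and Technology Bureau (2011KJ22)
Hunan Provincial Education Science "12th Five-Year Plan" Project (XJK014CGD006)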
Keywords
multi-core CPU
GPU
depth-first search (DFS)
parallel
heterogeneous