Funding: This work was supported by the National Key Research and Development Program of China (2018YFB1003502) and the National Natural Science Foundation of China (Grant Nos. 61825202, 61832006, and 61702201).
Abstract: A hybrid pull-push computational model can provide compelling results over either model alone when processing real-world graphs. The programmability and pipeline parallelism of FPGAs make them a promising platform for processing the different stages of graph iterations. Nevertheless, given the limited on-chip resources and the streamlined pipeline computation, the efficiency of the hybrid model on FPGAs often suffers from the well-known random-access behavior of graph processing. In this paper, we present a hybrid graph processing system on FPGAs that achieves the best of both worlds. Our approach is novel in the following respects. First, we propose the edge block (a set of edges that share the same destination vertex set), which allows edges to be accessed sequentially at block granularity for locality while still preserving precision. Because blocks are independent, in the sense that all edges in an inactive block are associated with inactive vertices, this also makes it possible to skip invalid blocks and thereby reduce redundant computation. Second, we consider a large number of vertices and their associated edge blocks in order to maintain a predictable execution history. We also propose switching models in advance, with few stalls, using their state statistics. Our evaluation on a wide variety of graph algorithms and many real-world graphs shows that our approach achieves up to a 3.69x speedup over state-of-the-art FPGA-based graph processing systems.
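The edge-block idea in the abstract can be illustrated with a small software sketch. This is not the paper's FPGA design; it is one possible reading, under the assumption that destination vertices are split into fixed-size contiguous intervals and that a block counts as active when at least one vertex in its destination set is active. The names build_edge_blocks, blocks_to_process, and block_size are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_edge_blocks(edges, block_size):
    """Group (src, dst) edges by destination interval.

    Assumption: the "destination vertex set" of a block is the contiguous
    interval of block_size vertex ids that contains dst.
    """
    blocks = defaultdict(list)
    for src, dst in edges:
        blocks[dst // block_size].append((src, dst))
    return dict(blocks)


def blocks_to_process(blocks, active, block_size):
    """Return the ids of blocks whose destination set holds an active vertex.

    Every other block touches only inactive vertices and can be skipped.
    """
    active_ids = {v // block_size for v in active}
    return sorted(b for b in blocks if b in active_ids)


# Tiny example graph; edges are (source, destination) pairs.
edges = [(0, 1), (2, 1), (3, 4), (1, 5), (0, 7)]
blocks = build_edge_blocks(edges, block_size=2)

# Only vertex 5 is active, so only the block covering vertices 4-5 is read.
todo = blocks_to_process(blocks, active={5}, block_size=2)
for block_id in todo:
    for src, dst in blocks[block_id]:  # sequential scan within one block
        pass  # apply the per-edge update of the chosen algorithm here
```

Within a selected block the edges are scanned in order, which is the locality benefit the abstract describes, while whole blocks with no active vertex are never read.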