In the graph signal processing(GSP)framework,distributed algorithms are highly desirable in processing signals defined on large-scale networks.However,in most existing distributed algorithms,all nodes homogeneously pe...In the graph signal processing(GSP)framework,distributed algorithms are highly desirable in processing signals defined on large-scale networks.However,in most existing distributed algorithms,all nodes homogeneously perform the local computation,which calls for heavy computational and communication costs.Moreover,in many real-world networks,such as those with straggling nodes,the homogeneous manner may result in serious delay or even failure.To this end,we propose active network decomposition algorithms to select non-straggling nodes(normal nodes)that perform the main computation and communication across the network.To accommodate the decomposition in different kinds of networks,two different approaches are developed,one is centralized decomposition that leverages the adjacency of the network and the other is distributed decomposition that employs the indicator message transmission between neighboring nodes,which constitutes the main contribution of this paper.By incorporating the active decomposition scheme,a distributed Newton method is employed to solve the least squares problem in GSP,where the Hessian inverse is approximately evaluated by patching a series of inverses of local Hessian matrices each of which is governed by one normal node.The proposed algorithm inherits the fast convergence of the second-order algorithms while maintains low computational and communication cost.Numerical examples demonstrate the effectiveness of the proposed algorithm.展开更多
In previous papers, the stationary distributions of a class of discrete and continuoustime random graph processes with state space consisting of the simple and directed graphs on Nvenices were studied. In this paper, ...In previous papers, the stationary distributions of a class of discrete and continuoustime random graph processes with state space consisting of the simple and directed graphs on Nvenices were studied. In this paper, the random graph graph process is extended one impotent stepfurther by allowing interaction of edges. Similarly, We obtha the expressions of the stationarydistributions and prove that the process is ergodic under different editions.展开更多
With the rapid growth of real-world graphs,the size of which can easily exceed the on-chip(board)storage capacity of an accelerator,processing large-scale graphs on a single Field Programmable Gate Array(FPGA)becomes ...With the rapid growth of real-world graphs,the size of which can easily exceed the on-chip(board)storage capacity of an accelerator,processing large-scale graphs on a single Field Programmable Gate Array(FPGA)becomes difficult.The multi-FPGA acceleration is of great necessity and importance.Many cloud providers(e.g.,Amazon,Microsoft,and Baidu)now expose FPGAs to users in their data centers,providing opportunities to accelerate large-scale graph processing.In this paper,we present a communication library,called FDGLib,which can easily scale out any existing single FPGA-based graph accelerator to a distributed version in a data center,with minimal hardware engineering efforts.FDGLib provides six APIs that can be easily used and integrated into any FPGA-based graph accelerator with only a few lines of code modifications.Considering the torus-based FPGA interconnection in data centers,FDGLib also improves communication efficiency using simple yet effective torus-friendly graph partition and placement schemes.We interface FDGLib into AccuGraph,a state-of-the-art graph accelerator.Our results on a 32-node Microsoft Catapult-like data center show that the distributed AccuGraph can be 2.32x and 4.77x faster than a state-of-the-art distributed FPGA-based graph accelerator ForeGraph and a distributed CPU-based graph system Gemini,with better scalability.展开更多
This paper preseflts a new approach of the synchronous homogeneous concurrent propagation of competitive waves for the purpose of hyper-distributed hyper-parallel heuristic problem-solving. The concurrent algorithm, m...This paper preseflts a new approach of the synchronous homogeneous concurrent propagation of competitive waves for the purpose of hyper-distributed hyper-parallel heuristic problem-solving. The concurrent algorithm, mechanism and their properties are given. In comparison with the traditional AI algorithms, the approach is featured by the knowledge-based problem-solving in the distributed parallel environment, the feasibility for hardware implementation and the various applications.展开更多
This paper proposes an asynchronous heterogeneous propagation approach of concurrent competitive waves for hyper-distributed hyper-parallel heuris tic problem-solving. This approach is much more powerful than the sync...This paper proposes an asynchronous heterogeneous propagation approach of concurrent competitive waves for hyper-distributed hyper-parallel heuris tic problem-solving. This approach is much more powerful than the synchronous homogeneous mechanisms and the asynchronous superimposition algorithms, and has universal validity and availability. The basic conception, concurrent algorithm and its properties are discussed. The theory and conclusions drawn in this paper are of essential importance for the hardware implementation of hyper-distributed hyper-parallel processing based on chaotic cellular networks.展开更多
With the popularity of social network, the de- mand for real-time processing of graph data is increasing. However, most of the existing graph systems adopt a batch processing mode, therefore the overhead of maintainin...With the popularity of social network, the de- mand for real-time processing of graph data is increasing. However, most of the existing graph systems adopt a batch processing mode, therefore the overhead of maintaining and processing of dynamic graph is significantly high. In this pa- per, we design iGraph, an incremental graph processing sys- tem for dynamic graph with its continuous updates. The con- tribufions of iGraph include: 1) a hash-based graph partition strategy to enable fine-grained graph updates; 2) a vertex- based graph computing model to support incremental data processing; 3) detection and rebalance methods of hotspot to address the workload imbalance problem during incre- mental processing. Through the general-purpose API, iGraph can be used to implement various graph processing algo- rithms such as PageRank. We have implemented iGraph on Apache Spark, and experimental results show that for real life datasets, iGraph outperforms the original GraphX in respect of graph update and graph computation.展开更多
基金supported by National Natural Science Foundation of China(Grant No.61761011)Natural Science Foundation of Guangxi(Grant No.2020GXNSFBA297078).
文摘In the graph signal processing(GSP)framework,distributed algorithms are highly desirable in processing signals defined on large-scale networks.However,in most existing distributed algorithms,all nodes homogeneously perform the local computation,which calls for heavy computational and communication costs.Moreover,in many real-world networks,such as those with straggling nodes,the homogeneous manner may result in serious delay or even failure.To this end,we propose active network decomposition algorithms to select non-straggling nodes(normal nodes)that perform the main computation and communication across the network.To accommodate the decomposition in different kinds of networks,two different approaches are developed,one is centralized decomposition that leverages the adjacency of the network and the other is distributed decomposition that employs the indicator message transmission between neighboring nodes,which constitutes the main contribution of this paper.By incorporating the active decomposition scheme,a distributed Newton method is employed to solve the least squares problem in GSP,where the Hessian inverse is approximately evaluated by patching a series of inverses of local Hessian matrices each of which is governed by one normal node.The proposed algorithm inherits the fast convergence of the second-order algorithms while maintains low computational and communication cost.Numerical examples demonstrate the effectiveness of the proposed algorithm.
文摘In previous papers, the stationary distributions of a class of discrete and continuoustime random graph processes with state space consisting of the simple and directed graphs on Nvenices were studied. In this paper, the random graph graph process is extended one impotent stepfurther by allowing interaction of edges. Similarly, We obtha the expressions of the stationarydistributions and prove that the process is ergodic under different editions.
基金supported by the National Key Research and Development Program of China under Grant No.2018YFB1003502the National Natural Science Foundation of China under Grant Nos.62072195,61825202,61832006,and 61628204.
文摘With the rapid growth of real-world graphs,the size of which can easily exceed the on-chip(board)storage capacity of an accelerator,processing large-scale graphs on a single Field Programmable Gate Array(FPGA)becomes difficult.The multi-FPGA acceleration is of great necessity and importance.Many cloud providers(e.g.,Amazon,Microsoft,and Baidu)now expose FPGAs to users in their data centers,providing opportunities to accelerate large-scale graph processing.In this paper,we present a communication library,called FDGLib,which can easily scale out any existing single FPGA-based graph accelerator to a distributed version in a data center,with minimal hardware engineering efforts.FDGLib provides six APIs that can be easily used and integrated into any FPGA-based graph accelerator with only a few lines of code modifications.Considering the torus-based FPGA interconnection in data centers,FDGLib also improves communication efficiency using simple yet effective torus-friendly graph partition and placement schemes.We interface FDGLib into AccuGraph,a state-of-the-art graph accelerator.Our results on a 32-node Microsoft Catapult-like data center show that the distributed AccuGraph can be 2.32x and 4.77x faster than a state-of-the-art distributed FPGA-based graph accelerator ForeGraph and a distributed CPU-based graph system Gemini,with better scalability.
文摘This paper preseflts a new approach of the synchronous homogeneous concurrent propagation of competitive waves for the purpose of hyper-distributed hyper-parallel heuristic problem-solving. The concurrent algorithm, mechanism and their properties are given. In comparison with the traditional AI algorithms, the approach is featured by the knowledge-based problem-solving in the distributed parallel environment, the feasibility for hardware implementation and the various applications.
文摘This paper proposes an asynchronous heterogeneous propagation approach of concurrent competitive waves for hyper-distributed hyper-parallel heuris tic problem-solving. This approach is much more powerful than the synchronous homogeneous mechanisms and the asynchronous superimposition algorithms, and has universal validity and availability. The basic conception, concurrent algorithm and its properties are discussed. The theory and conclusions drawn in this paper are of essential importance for the hardware implementation of hyper-distributed hyper-parallel processing based on chaotic cellular networks.
文摘With the popularity of social network, the de- mand for real-time processing of graph data is increasing. However, most of the existing graph systems adopt a batch processing mode, therefore the overhead of maintaining and processing of dynamic graph is significantly high. In this pa- per, we design iGraph, an incremental graph processing sys- tem for dynamic graph with its continuous updates. The con- tribufions of iGraph include: 1) a hash-based graph partition strategy to enable fine-grained graph updates; 2) a vertex- based graph computing model to support incremental data processing; 3) detection and rebalance methods of hotspot to address the workload imbalance problem during incre- mental processing. Through the general-purpose API, iGraph can be used to implement various graph processing algo- rithms such as PageRank. We have implemented iGraph on Apache Spark, and experimental results show that for real life datasets, iGraph outperforms the original GraphX in respect of graph update and graph computation.