Journal Articles
12 articles found
1. BAR: a branch-alternation-resorting algorithm for locality exploration in graph processing
Authors: 邓军勇, WANG Junjie, JIANG Lin, XIE Xiaoyan, ZHOU Kai. High Technology Letters (EI, CAS), 2024, No. 1, pp. 31-42 (12 pages).
Unstructured and irregular graph data causes strong randomness and poor locality of data accesses in graph processing. This paper optimizes the depth-branch-resorting algorithm (DBR) and proposes a branch-alternation-resorting algorithm (BAR). To make the algorithm run in parallel and improve its efficiency, the BAR algorithm is mapped onto the reconfigurable array processor (APR-16) to achieve vertex reordering, effectively improving the locality of graph data. This paper validates the BAR algorithm on the GraphBIG framework by running the breadth-first search (BFS), single-source shortest path (SSSP), and betweenness centrality (BC) algorithms on the BAR-reordered datasets. The results show that compared with the DBR and Corder algorithms, BAR reduces execution time by up to 33.00% and 51.00%, respectively. In terms of data movement, the BAR algorithm achieves a maximum reduction of 39.00% compared with the DBR algorithm and 29.66% compared with the Corder algorithm. In terms of computational complexity, the BAR algorithm achieves a maximum reduction of 32.56% compared with the DBR algorithm and 53.05% compared with the Corder algorithm.
Keywords: graph processing, vertex reordering, branch-alternation-resorting algorithm (BAR), reconfigurable array processor
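As a rough illustration of why vertex reordering helps locality, the sketch below relabels the vertices of an adjacency list in BFS visit order so that neighboring vertices receive nearby IDs. This is a generic reordering, not the BAR (or DBR/Corder) algorithm itself; the function name and the toy graph are invented for the example.

```python
# Illustrative only: a generic BFS-order vertex reordering, not the BAR
# algorithm itself (whose branch-alternation details are in the paper).
from collections import deque

def bfs_reorder(adj, root=0):
    """Relabel vertices in BFS visit order so neighbors get nearby IDs."""
    n = len(adj)
    new_id = [-1] * n
    visited = [False] * n
    order = 0
    for start in [root] + list(range(n)):     # also cover disconnected components
        if visited[start]:
            continue
        visited[start] = True
        queue = deque([start])
        while queue:
            v = queue.popleft()
            new_id[v] = order
            order += 1
            for u in adj[v]:
                if not visited[u]:
                    visited[u] = True
                    queue.append(u)
    # Rebuild the adjacency list under the new labels.
    new_adj = [[] for _ in range(n)]
    for v, neighbors in enumerate(adj):
        new_adj[new_id[v]] = sorted(new_id[u] for u in neighbors)
    return new_adj, new_id

# Example: a small graph whose reordered form clusters neighbor IDs.
adj = [[3], [3, 4], [4], [0, 1], [1, 2]]
print(bfs_reorder(adj))
```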
2. Towards High-Performance Graph Processing: From a Hardware/Software Co-Design Perspective
Authors: 廖小飞, 赵文举, 金海, 姚鹏程, 黄禹, 王庆刚, 赵进, 郑龙, 张宇, 邵志远. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2024, No. 2, pp. 245-266 (22 pages).
Graph processing has been widely used in many scenarios, from scientific computing to artificial intelligence. Unlike traditional workloads, graph processing exhibits irregular computational parallelism and random memory accesses. Therefore, running graph processing workloads on conventional architectures (e.g., CPUs and GPUs) often shows a significantly low compute-memory ratio with few performance benefits, and can in many cases even be slower than a specialized single-thread graph algorithm. While domain-specific hardware designs are essential for graph processing, it remains challenging to translate hardware capability into a performance boost without coupled software co-designs. This article presents a graph processing ecosystem spanning hardware and software. We start by introducing a series of hardware accelerators as the foundation of this ecosystem. Subsequently, the co-designed parallel graph systems and their distributed techniques are presented to support graph applications. Finally, we introduce our efforts on novel graph applications and hardware architectures. Extensive results show that various graph applications can be efficiently accelerated in this graph processing ecosystem.
Keywords: graph processing, hardware accelerator, software system, high performance, ecosystem
3. Big Data Analytics Using Graph Signal Processing
Authors: Farhan Amin, Omar M. Barukab, Gyu Sang Choi. Computers, Materials & Continua (SCIE, EI), 2023, No. 1, pp. 489-502 (14 pages).
Networks are fundamental to our modern world and appear throughout science and society. Access to massive amounts of data presents a unique opportunity to the research community. As networks grow in size, their complexity increases, and our ability to analyze them with the current state of the art is at severe risk of failing to keep pace. This paper therefore initiates a discussion on graph signal processing for large-scale data analysis. We first provide a comprehensive overview of core ideas in graph signal processing (GSP) and their connection to conventional digital signal processing (DSP). We then summarize recent developments in basic GSP tools, including methods for graph filtering, graph learning, graph signals, the graph Fourier transform (GFT), spectra, and graph frequencies. Graph filtering is a basic task that isolates the contribution of individual frequencies and therefore enables the removal of noise. We then consider a graph filter as a model that helps extend the application of GSP methods to large datasets. To show the suitability and effectiveness of this approach, we first create a noisy graph signal and then apply it to the filter. After several rounds of simulation, we see that the filtered signal appears smoother and is closer to the original noise-free distance-based signal. Through this example application, we demonstrate that graph filtering is efficient for big data analytics.
Keywords: big data, data science, big data processing, graph signal processing, social networks
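The workflow described in the abstract (create a noisy graph signal, filter it, observe a smoother result) can be sketched as follows. This is a minimal ideal low-pass filter built from the Laplacian eigendecomposition (the graph Fourier basis); the path graph, the cutoff, and the synthetic signal are assumptions for illustration, not the paper's actual setup.

```python
# A minimal graph low-pass filtering sketch (not the paper's exact filter):
# smooth a noisy signal on a path graph using the graph Laplacian spectrum.
import numpy as np

def path_laplacian(n):
    A = np.zeros((n, n))
    for i in range(n - 1):
        A[i, i + 1] = A[i + 1, i] = 1.0
    return np.diag(A.sum(axis=1)) - A

n = 64
L = path_laplacian(n)
eigvals, U = np.linalg.eigh(L)                 # graph Fourier basis (GFT)

rng = np.random.default_rng(0)
clean = np.sin(np.linspace(0, 2 * np.pi, n))   # smooth "distance-like" signal
noisy = clean + 0.3 * rng.standard_normal(n)

# Ideal low-pass graph filter: keep only the lowest-frequency components.
h = (eigvals < eigvals[n // 8]).astype(float)
filtered = U @ (h * (U.T @ noisy))             # GFT -> mask -> inverse GFT

print("noise before:", np.linalg.norm(noisy - clean))
print("noise after :", np.linalg.norm(filtered - clean))
```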
4. An effective framework for asynchronous incremental graph processing (Cited: 4)
Authors: Xinqiao LV, Wei XIAO, Yu ZHANG, Xiaofei LIAO, Hai JIN, Qiangsheng HUA. Frontiers of Computer Science (SCIE, EI, CSCD), 2019, No. 3, pp. 539-551 (13 pages).
Although many graph processing systems have been proposed, graphs in the real world are often dynamic, so it is important to keep the results of graph computation up-to-date. Incremental computation has been demonstrated to be an efficient way to update calculated results. Recently, many incremental graph processing systems have been proposed to handle dynamic graphs asynchronously, achieving better performance than synchronous processing. However, these solutions still suffer from sub-optimal convergence speed due to their slow propagation of important vertex states (important to convergence speed) and poor locality. To solve these problems, we propose a novel graph processing framework. It introduces a dynamic partition method to gather the important vertices for high locality, and then uses a priority-based scheduling algorithm to assign them a higher priority for an effective processing order. By these means, it reduces the number of updates and increases locality, thereby reducing the convergence time. Experimental results show that our method reduces the number of updates by 30% and the total execution time by 35% compared with state-of-the-art systems.
Keywords: incremental computation, graph processing, iterative computation, asynchronous, convergence
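A hedged sketch of the priority idea: propagate the largest pending vertex updates first. The residual-based asynchronous PageRank below keys a priority queue on each vertex's pending residual; it illustrates priority-ordered asynchronous processing only and is not the paper's partitioning or scheduling design.

```python
# Minimal sketch of priority-based asynchronous propagation (illustrative,
# not the paper's system): vertices with larger pending residual are updated
# first, which tends to cut the number of updates needed to converge.
import heapq

def async_pagerank(adj, damping=0.85, tol=1e-6):
    n = len(adj)
    rank = [0.0] * n
    residual = [1 - damping] * n
    # max-heap on residual (negated), i.e., most "important" vertex first
    heap = [(-residual[v], v) for v in range(n)]
    heapq.heapify(heap)
    while heap:
        neg_r, v = heapq.heappop(heap)
        r = residual[v]
        if r < tol or -neg_r != r:        # converged vertex or stale entry
            continue
        rank[v] += r
        residual[v] = 0.0
        if adj[v]:
            share = damping * r / len(adj[v])
            for u in adj[v]:
                residual[u] += share
                heapq.heappush(heap, (-residual[u], u))
    return rank

adj = [[1, 2], [2], [0], [0, 2]]
print(async_pagerank(adj))
```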
5. Efficient FPGA-based graph processing with hybrid pull-push computational model (Cited: 1)
Authors: Chengbo YANG, Long ZHENG, Chuangyi GUI, Hai JIN. Frontiers of Computer Science (SCIE, EI, CSCD), 2020, No. 4, pp. 13-28 (16 pages).
A hybrid pull-push computational model can provide compelling results over either single model for processing real-world graphs. The programmability and pipeline parallelism of FPGAs make them well suited to processing the different stages of graph iterations. Nevertheless, given the limited on-chip resources and streamlined pipeline computation, the efficiency of the hybrid model on FPGAs often suffers from the well-known random access pattern of graph processing. In this paper, we present a hybrid graph processing system on FPGAs that achieves the best of both worlds. Our approach on FPGAs is unique and novel as follows. First, we propose to use edge blocks (consisting of edges with the same destination vertex set), which allow edges to be accessed sequentially at block granularity for locality while still preserving precision. Because blocks are independent, in the sense that all edges in an inactive block are associated with inactive vertices, this also enables invalid blocks to be skipped, reducing redundant computation. Second, we consider a large number of vertices and their associated edge blocks to maintain a predictable execution history, and we switch models in advance, with few stalls, based on their state statistics. Our evaluation on a wide variety of graph algorithms and many real-world graphs shows that our approach achieves up to 3.69x speedup over state-of-the-art FPGA-based graph processing systems.
Keywords: graph processing, efficiency, computational model, FPGAs
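For readers unfamiliar with the pull-push distinction, the following software sketch shows the classic direction-optimizing BFS switch: push (scatter from the frontier) when the frontier is small, pull (gather at unvisited vertices) when it is large. The threshold `alpha` and the toy graph are assumptions; the paper's edge-block design and FPGA-specific switching heuristics are not reproduced.

```python
# Software analog of pull/push switching (direction-optimizing BFS), given
# only to illustrate the hybrid model; not the paper's FPGA implementation.
def hybrid_bfs(adj, radj, root, alpha=0.25):
    n = len(adj)
    depth = [-1] * n
    depth[root] = 0
    frontier = {root}
    level = 0
    while frontier:
        level += 1
        if len(frontier) < alpha * n:
            # push: active vertices scatter to their out-neighbors
            nxt = {u for v in frontier for u in adj[v] if depth[u] == -1}
        else:
            # pull: every unvisited vertex gathers from its in-neighbors
            nxt = {u for u in range(n) if depth[u] == -1
                   and any(depth[w] == level - 1 for w in radj[u])}
        for u in nxt:
            depth[u] = level
        frontier = nxt
    return depth

adj  = [[1, 2], [3], [3], [4], []]
radj = [[], [0], [0], [1, 2], [3]]
print(hybrid_bfs(adj, radj, root=0))   # expected: [0, 1, 1, 2, 3]
```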
6. FDGLib: A Communication Library for Efficient Large-Scale Graph Processing in FPGA-Accelerated Data Centers
Authors: Yu-Wei Wu, Qing-Gang Wang, Long Zheng, Xiao-Fei Liao, Hai Jin, Wen-Bin Jiang, Ran Zheng, Kan Hu. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2021, No. 5, pp. 1051-1070 (20 pages).
With the rapid growth of real-world graphs, whose size can easily exceed the on-chip (board) storage capacity of an accelerator, processing large-scale graphs on a single Field Programmable Gate Array (FPGA) becomes difficult. Multi-FPGA acceleration is therefore of great necessity and importance. Many cloud providers (e.g., Amazon, Microsoft, and Baidu) now expose FPGAs to users in their data centers, providing opportunities to accelerate large-scale graph processing. In this paper, we present a communication library, called FDGLib, which can easily scale out any existing single-FPGA graph accelerator to a distributed version in a data center with minimal hardware engineering effort. FDGLib provides six APIs that can be easily used and integrated into any FPGA-based graph accelerator with only a few lines of code modification. Considering the torus-based FPGA interconnection in data centers, FDGLib also improves communication efficiency using simple yet effective torus-friendly graph partition and placement schemes. We interface FDGLib with AccuGraph, a state-of-the-art graph accelerator. Our results on a 32-node Microsoft Catapult-like data center show that the distributed AccuGraph can be 2.32x and 4.77x faster than ForeGraph, a state-of-the-art distributed FPGA-based graph accelerator, and Gemini, a distributed CPU-based graph system, with better scalability.
Keywords: data center, accelerator, graph processing, distributed architecture, communication optimization
7. Identifying influential nodes based on graph signal processing in complex networks (Cited: 1)
Authors: 赵佳, 喻莉, 李静茹, 周鹏. Chinese Physics B (SCIE, EI, CAS, CSCD), 2015, No. 5, pp. 639-648 (10 pages).
Identifying influential nodes in complex networks is of both theoretical and practical importance. Existing methods identify influential nodes based on their positions in the network and assume that the nodes are homogeneous. However, node heterogeneity (i.e., different attributes such as interest, energy, age, and so on) exists ubiquitously and needs to be taken into consideration. In this paper, we investigate node attributes and propose a graph signal processing based centrality (GSPC) method to identify influential nodes considering both node attributes and network topology. We first evaluate our GSPC method using two real-world datasets. The results show that our GSPC method effectively identifies influential nodes, which correspond well with the underlying ground truth, and that it is consistent with the previous eigenvector centrality and principal component centrality methods under circumstances where the nodes are homogeneous. In addition, spreading analysis shows that the GSPC method has a positive effect on the spreading dynamics.
Keywords: complex networks, graph signal processing, influential node identification
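As a loose analogue of combining topology with node attributes, the sketch below runs power iteration on an attribute-weighted adjacency matrix, so well-connected nodes with high attribute values rank highest. The weighting scheme, the attribute vector, and the function name are hypothetical; the actual GSPC construction is defined in the paper.

```python
# Illustrative only: eigenvector-style centrality weighted by a node
# attribute, a rough stand-in for mixing topology with node heterogeneity.
import numpy as np

def attribute_weighted_centrality(A, attr, iters=200):
    """Power iteration on diag(attr) @ A: high-attribute, well-connected
    nodes obtain the largest scores."""
    M = np.diag(attr) @ A
    x = np.ones(A.shape[0])
    for _ in range(iters):
        x = M @ x
        x /= np.linalg.norm(x)
    return x

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 1],
              [1, 1, 0, 0],
              [0, 1, 0, 0]], dtype=float)
attr = np.array([1.0, 2.0, 1.0, 0.5])   # hypothetical node attributes
print(attribute_weighted_centrality(A, attr))
```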
8. A Distributed Newton Method for Processing Signals Defined on the Large-Scale Networks
Authors: Yanhai Zhang, Junzheng Jiang, Haitao Wang, Mou Ma. China Communications (SCIE, CSCD), 2023, No. 5, pp. 315-329 (15 pages).
In the graph signal processing (GSP) framework, distributed algorithms are highly desirable for processing signals defined on large-scale networks. However, in most existing distributed algorithms, all nodes homogeneously perform the local computation, which incurs heavy computational and communication costs. Moreover, in many real-world networks, such as those with straggling nodes, this homogeneous manner may result in serious delay or even failure. To this end, we propose active network decomposition algorithms to select non-straggling nodes (normal nodes) that perform the main computation and communication across the network. To accommodate the decomposition in different kinds of networks, two approaches are developed: a centralized decomposition that leverages the adjacency of the network, and a distributed decomposition that employs indicator message transmission between neighboring nodes; these constitute the main contribution of this paper. By incorporating the active decomposition scheme, a distributed Newton method is employed to solve the least squares problem in GSP, where the Hessian inverse is approximately evaluated by patching together a series of inverses of local Hessian matrices, each of which is governed by one normal node. The proposed algorithm inherits the fast convergence of second-order algorithms while maintaining low computational and communication cost. Numerical examples demonstrate the effectiveness of the proposed algorithm.
Keywords: graph signal processing, distributed Newton method, active network decomposition, second-order algorithm
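The Newton step for a graph least squares problem can be shown centrally in a few lines: for a quadratic objective the Hessian is constant, so one step from zero lands on the minimizer. The graph-regularized objective below is a common GSP formulation chosen for illustration; the paper's actual objective and its distributed Hessian patching across normal nodes are not reproduced here.

```python
# Centralized sketch of a Newton step for graph-regularized least squares:
#   min_x ||x - y||^2 + mu * x^T L x
# Quadratic objective => constant Hessian => one Newton step from x = 0
# reaches the minimizer (I + mu*L)^{-1} y.
import numpy as np

def newton_graph_lsq(L, y, mu=1.0):
    n = len(y)
    grad_at_zero = -2.0 * y                  # gradient of the objective at x = 0
    hessian = 2.0 * (np.eye(n) + mu * L)     # constant Hessian of the quadratic
    return -np.linalg.solve(hessian, grad_at_zero)

# Tiny 4-node path graph and a noisy signal.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A
y = np.array([1.0, 0.2, 0.9, 0.1])
print(newton_graph_lsq(L, y, mu=0.5))        # smoothed estimate of y
```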
9. A survey on dynamic graph processing on GPUs: concepts, terminologies and systems
Authors: Hongru GAO, Xiaofei LIAO, Zhiyuan SHAO, Kexin LI, Jiajie CHEN, Hai JIN. Frontiers of Computer Science (SCIE, EI), 2024, No. 4, pp. 1-23 (23 pages).
Graphs, which model real-world entities as vertices and the relationships among entities as edges, have proven to be a powerful tool for describing real-world problems in applications. In most real-world scenarios, entities and their relationships are subject to constant change. Graphs that record such changes are called dynamic graphs. In recent years, the widespread application scenarios of dynamic graphs have stimulated extensive research on dynamic graph processing systems, which continuously ingest graph updates and produce up-to-date graph analytics results. As the scale of dynamic graphs grows, higher performance is demanded of dynamic graph processing systems. With their massive parallel processing power and high memory bandwidth, GPUs have become the mainstream vehicle for accelerating dynamic graph processing tasks. GPU-based dynamic graph processing systems mainly address two challenges: maintaining the graph data when updates occur (i.e., graph updating) and producing analytics results in time (i.e., graph computing). In this paper, we survey GPU-based dynamic graph processing systems and review their methods for addressing both graph updating and graph computing. To comprehensively discuss existing dynamic graph processing systems on GPUs, we first introduce the terminology of dynamic graph processing and then develop a taxonomy describing the methods employed for graph updating and graph computing. In addition, we discuss the challenges and future research directions of dynamic graph processing on GPUs.
Keywords: dynamic graphs, graph processing, graph algorithms, GPUs
10. A disk I/O optimized system for concurrent graph processing jobs
Authors: Xianghao XU, Fang WANG, Hong JIANG, Yongli CHENG, Dan FENG, Peng FANG. Frontiers of Computer Science (SCIE, EI), 2024, No. 3, pp. 13-29 (17 pages).
To analyze and process large graphs with high cost efficiency, researchers have developed a number of out-of-core graph processing systems in recent years based on just one commodity computer. On the other hand, with the rapidly growing need to analyze real-world graphs, graph processing systems have to efficiently handle massive concurrent graph processing (CGP) jobs. Unfortunately, due to their inherent design for a single graph processing job, existing out-of-core graph processing systems usually incur unnecessary data accesses and severe competition for I/O bandwidth when handling CGP jobs. In this paper, we propose GraphCP, a disk I/O optimized out-of-core graph processing system that efficiently supports the processing of CGP jobs. GraphCP proposes a benefit-aware sharing execution model to share the I/O access and processing of graph data among the CGP jobs and adaptively schedules the graph data loading based on the states of vertices, which efficiently overcomes the above challenges faced by existing out-of-core graph processing systems. Moreover, GraphCP adopts a dependency-based future-vertex updating model to reduce disk I/O in future iterations. In addition, GraphCP organizes the graph data with a Source-Sorted Sub-Block graph representation for better processing capacity and I/O access locality. Extensive evaluation results show that GraphCP is 20.5× and 8.9× faster than the out-of-core graph processing systems GridGraph and GraphZ, and 3.5× and 1.7× faster than the state-of-the-art concurrent graph processing systems Seraph and GraphSO.
Keywords: graph processing, disk I/O, concurrent jobs
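A toy sketch of the sharing idea: during one sequential sweep over on-disk edge blocks, each block is read once and handed to every concurrent job whose active vertices appear in it, instead of being re-read per job. The `Job` class, block format, and interest test are invented for the example and are far simpler than GraphCP's benefit-aware model.

```python
# Toy illustration (not GraphCP's actual scheduler): share one disk sweep
# over edge blocks among all concurrent jobs that need each block.
def shared_sweep(blocks, jobs):
    """blocks: list of (block_id, edges); jobs: objects with an `active`
    vertex set and a `process(edges)` method."""
    loads = 0
    for block_id, edges in blocks:            # one sequential sweep
        interested = [j for j in jobs
                      if j.active & {src for src, _ in edges}]
        if not interested:
            continue                          # skip blocks no job needs
        loads += 1                            # block is read exactly once
        for job in interested:
            job.process(edges)                # shared among interested jobs
    return loads

class Job:
    def __init__(self, active):
        self.active = set(active)
        self.seen = []
    def process(self, edges):
        self.seen.extend(e for e in edges if e[0] in self.active)

blocks = [(0, [(0, 1), (1, 2)]), (1, [(5, 6)])]
jobs = [Job({0}), Job({1, 5})]
print(shared_sweep(blocks, jobs), [j.seen for j in jobs])
```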
11. iGraph: an incremental data processing system for dynamic graph (Cited: 5)
Authors: Wuyang JU, Jianxin LI, Weiren YU, Richong ZHANG. Frontiers of Computer Science (SCIE, EI, CSCD), 2016, No. 3, pp. 462-476 (15 pages).
With the popularity of social networks, the demand for real-time processing of graph data is increasing. However, most existing graph systems adopt a batch processing mode, so the overhead of maintaining and processing dynamic graphs is significantly high. In this paper, we design iGraph, an incremental graph processing system for dynamic graphs with continuous updates. The contributions of iGraph include: 1) a hash-based graph partition strategy to enable fine-grained graph updates; 2) a vertex-based graph computing model to support incremental data processing; 3) hotspot detection and rebalancing methods to address the workload imbalance problem during incremental processing. Through its general-purpose API, iGraph can be used to implement various graph processing algorithms such as PageRank. We have implemented iGraph on Apache Spark, and experimental results show that for real-life datasets, iGraph outperforms the original GraphX in terms of graph update and graph computation.
Keywords: big data, distributed system, in-memory computing, graph processing, hotspot detection
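A minimal sketch of a hash-based vertex partition, the general idea behind fine-grained updates: each edge insertion touches only the partition that owns its source vertex. The class and function names are hypothetical; iGraph's actual Spark-based implementation and hotspot rebalancing are not shown.

```python
# Minimal hash-partitioned adjacency store (illustrative only): an update
# is routed to exactly one partition, so it can be applied at fine grain.
def owner(vertex, num_partitions):
    return hash(vertex) % num_partitions

class HashPartitionedGraph:
    def __init__(self, num_partitions=4):
        self.p = num_partitions
        # each partition holds adjacency lists only for vertices it owns
        self.partitions = [dict() for _ in range(num_partitions)]

    def add_edge(self, src, dst):
        # an update touches exactly one partition: the owner of `src`
        part = self.partitions[owner(src, self.p)]
        part.setdefault(src, []).append(dst)

    def neighbors(self, v):
        return self.partitions[owner(v, self.p)].get(v, [])

g = HashPartitionedGraph()
g.add_edge("a", "b"); g.add_edge("a", "c"); g.add_edge("b", "c")
print(g.neighbors("a"), g.neighbors("b"))
```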
12. Automatic Circuit Extractor for HDL Description Using Program Slicing (Cited: 1)
Authors: Tun Li, Yang Guo, Si-Kun Li. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2004, No. 5, pp. 718-728 (11 pages).
Design extraction and reduction have been extensively used in the modern VLSI design process. The extracted and reduced design can be efficiently processed by various applications, such as formal verification, simulation, and automatic test pattern generation (ATPG). This paper presents a new circuit extraction method using program slicing and develops an elegant theoretical basis, based on program slicing, for circuit extraction from Verilog descriptions. The technique can obtain a chaining slice for given signals of interest. Compared with related research, the main advantages of the method are that it is fine-grained, it imposes no hardware description language (HDL) coding style limitations, and it is precise and capable of dealing with various Verilog constructs. The technique has been integrated with a commercial simulation environment and incorporated into a design process. Results on practical designs show the significant benefits of the approach.
Keywords: program slicing, chaining slice, process dependence graph, circuit extraction, VLSI, functional verification
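The reachability core of a backward slice can be sketched in a few lines: starting from the signals of interest, follow dependence edges transitively and keep everything reached. The dependence-graph dictionary below is hypothetical; building it from Verilog (the chaining-slice construction) is the paper's contribution and is not shown.

```python
# Generic backward slice over an already-built dependence graph
# (illustrative only; the Verilog front end is not reproduced here).
def backward_slice(depends_on, signals_of_interest):
    """depends_on: maps each node to the nodes it reads from.
    Returns every node the signals of interest transitively depend on."""
    sliced = set()
    stack = list(signals_of_interest)
    while stack:
        node = stack.pop()
        if node in sliced:
            continue
        sliced.add(node)
        stack.extend(depends_on.get(node, []))
    return sliced

# Hypothetical dependence graph: out <- tmp <- {a, b}; unrelated: dbg <- c.
depends_on = {"out": ["tmp"], "tmp": ["a", "b"], "dbg": ["c"]}
print(backward_slice(depends_on, ["out"]))   # contains out, tmp, a, b
```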