摘要
图数据在现实生活中广泛存在,且不断发生变化。传统高效的静态图存储方式——压缩行/列(Compressed Sparse Row/Column,CSR/CSR)存储方式在更新图数据时需要大量的数据迁移,不适用于动态图数据。而能够高效更新图数据的邻接表(Adjacency List,AL)存储方式往往带有大量的指针,导致其图数据读取和分析效率低。Packed Compressed Sparse Row(PCSR)是一种基于CSR的动态图存储结构。该结构在存储边数据时并不是采用连续无空隙数组,而是采用留有空槽的压缩存储阵列(Packed Memory Arrays,PMA)结构,便于边数据的插入。因此,PCSR支持高效图更新和图分析。但是,PCSR在存储幂律图时,其性能容易受大度数顶点的影响。为此,基于PCSR提出一种支持可高效更新和分析动态幂律图的图存储结构Power-PCSR。该结构将幂律图中度数较大的顶点单独存储在一个独立的PMA中,其他所有小度数顶点与PCSR一样存储在原PMA中。小度顶点变化导致的数据迁移不会触及大度数顶点,从而大大减少了数据迁移数量;同样,大度数顶点更新导致的数据迁移只限制在每个大度数顶点的PMA内部,不会涉及小度数顶点和其他大度数顶点的数据迁移。实验显示,Power-PCSR在分析图数据时与PCSR具有相似的性能,而在更新图数据时比PCSR快2倍。
Graph data is widespread in real life and changes over time.The traditional efficient static graph storage structure,compressed sparse row/column(CSR/CSC)requires a large amount of data migration when inserting/deleting edges to/from graphs,which is not suitable for dynamic graphs.Although the adjacency list(AL)is able to update graphs efficiently,it is inefficient in reading and analyzing graphs since it has a large number of pointers,which results in random memory access.PCSR is a novel dynamic graph storage structure based on CSR.It employs a packed memory arrays(PMA)to store the edges rather than a continuous array.Because there are empty slots in PMA,it is easier to insert/delete edges.Thus,packed compressed sparse row(PCSR)is efficient in both graph updating and analysis.However,we find that the performance of PCSR suffers from large degree vertices when storing power-law graphs.For this,this paper proposes a new graph storage structure based on PCSR,Power-PCSR,which supports efficient updating and analysis of dynamic power-law graphs.In Power-PCSR,each large-degree vertex is stored in an independent PMA separately,and other vertices with small degrees are stored in a PMA.The data migration caused by the small-degree vertices will not lead to the migration of large-degree vertices,thus greatly reducing the amount of data migration.Similarly,the data migration caused by the update of large-degree vertices is only limited to its PMA,and will not involve the data migration of other large-degree vertices and small-degree vertices.Experiments show that Power-PCSR has similar performance to PCSR when analyzing graphs,and is 2 times faster than PCSR when updating graph data.
作者
毛志雄
刘志楠
高叙宁
王蒙湘
巩树凤
张岩峰
MAO Zhixiong;LIU Zhinan;GAO Xuning;WANG Mengxiang;GONG Shufeng;ZHANG Yanfeng(School of Computer Science and Engineering,Northeastern University,Shenyang 110167,China;College of Medicine and Biological Information Engineering,Northeastern University,Shenyang 110167,China;China National Institute of Standardization,Beijing 100088,China)
出处
《计算机科学》
CSCD
北大核心
2024年第8期56-62,共7页
Computer Science
基金
国家自然科学基金青年科学基金(62202088)。