期刊文献+

面向幂律图的动态图存储结构Power-PCSR

Power-PCSR:An Efficient Dynamic Graph Storage Structure for Power-law Graphs
下载PDF
导出
摘要 图数据在现实生活中广泛存在,且不断发生变化。传统高效的静态图存储方式——压缩行/列(Compressed Sparse Row/Column,CSR/CSR)存储方式在更新图数据时需要大量的数据迁移,不适用于动态图数据。而能够高效更新图数据的邻接表(Adjacency List,AL)存储方式往往带有大量的指针,导致其图数据读取和分析效率低。Packed Compressed Sparse Row(PCSR)是一种基于CSR的动态图存储结构。该结构在存储边数据时并不是采用连续无空隙数组,而是采用留有空槽的压缩存储阵列(Packed Memory Arrays,PMA)结构,便于边数据的插入。因此,PCSR支持高效图更新和图分析。但是,PCSR在存储幂律图时,其性能容易受大度数顶点的影响。为此,基于PCSR提出一种支持可高效更新和分析动态幂律图的图存储结构Power-PCSR。该结构将幂律图中度数较大的顶点单独存储在一个独立的PMA中,其他所有小度数顶点与PCSR一样存储在原PMA中。小度顶点变化导致的数据迁移不会触及大度数顶点,从而大大减少了数据迁移数量;同样,大度数顶点更新导致的数据迁移只限制在每个大度数顶点的PMA内部,不会涉及小度数顶点和其他大度数顶点的数据迁移。实验显示,Power-PCSR在分析图数据时与PCSR具有相似的性能,而在更新图数据时比PCSR快2倍。 Graph data is widespread in real life and changes over time.The traditional efficient static graph storage structure,compressed sparse row/column(CSR/CSC)requires a large amount of data migration when inserting/deleting edges to/from graphs,which is not suitable for dynamic graphs.Although the adjacency list(AL)is able to update graphs efficiently,it is inefficient in reading and analyzing graphs since it has a large number of pointers,which results in random memory access.PCSR is a novel dynamic graph storage structure based on CSR.It employs a packed memory arrays(PMA)to store the edges rather than a continuous array.Because there are empty slots in PMA,it is easier to insert/delete edges.Thus,packed compressed sparse row(PCSR)is efficient in both graph updating and analysis.However,we find that the performance of PCSR suffers from large degree vertices when storing power-law graphs.For this,this paper proposes a new graph storage structure based on PCSR,Power-PCSR,which supports efficient updating and analysis of dynamic power-law graphs.In Power-PCSR,each large-degree vertex is stored in an independent PMA separately,and other vertices with small degrees are stored in a PMA.The data migration caused by the small-degree vertices will not lead to the migration of large-degree vertices,thus greatly reducing the amount of data migration.Similarly,the data migration caused by the update of large-degree vertices is only limited to its PMA,and will not involve the data migration of other large-degree vertices and small-degree vertices.Experiments show that Power-PCSR has similar performance to PCSR when analyzing graphs,and is 2 times faster than PCSR when updating graph data.
作者 毛志雄 刘志楠 高叙宁 王蒙湘 巩树凤 张岩峰 MAO Zhixiong;LIU Zhinan;GAO Xuning;WANG Mengxiang;GONG Shufeng;ZHANG Yanfeng(School of Computer Science and Engineering,Northeastern University,Shenyang 110167,China;College of Medicine and Biological Information Engineering,Northeastern University,Shenyang 110167,China;China National Institute of Standardization,Beijing 100088,China)
出处 《计算机科学》 CSCD 北大核心 2024年第8期56-62,共7页 Computer Science
基金 国家自然科学基金青年科学基金(62202088)。
关键词 动态图存储 动态图更新 数据迁移 Power-PCSR 幂律图 Dynamic graph storage Dynamic graph updates Data migration Power-PCSR Power-law graph
  • 相关文献

参考文献2

二级参考文献62

  • 1Amazon SimpleDB. http://aws, amazon, com/simpledb/, 2011-8-10.
  • 2Connor Alexander G, Chrysanthis Panos K, Labrinidis Alexandros. Key key-value stores for efficiently processing graph data in the cloud//Proceedings of the GDM. Hannover, Germany, 2011:88-93.
  • 3Lordanov Borislav. HyperGraphDB: A generalized graph database//Proceedings of the IWGD. JiuZhai Valley, China, 2010:25-36.
  • 4Eifrem Emil. NOSQL: Scaling to size and scaling to complexity, http://blogs, neotechnology, com/emil/2009/11/ nosql-scaling tosize-and-scaling-to-complexity, html, 2009- 1-15.
  • 5Wu Sai, Jiang Da-Wei, Ooi Beng Chin et al. Efficient B-tree based indexing for cloud data proeessing//Proeeedings of the VLDB. Singapore, 2010: 1207-1218.
  • 6Wang Jin-Bao, Wu Sai, Gao Hong et al. Indexing multi dimensional data in a cloud system//Proceedings of the SIGMOD. Indianapolis, Indiana, USA, 2010: 591-602.
  • 7Tsatsanifos George, Sacharidis Dimitris, Sellis Timos et al. MIDAS: Multi-attribute indexing for distributed architecture systems//Proceedings of the SSTD. Minneapolis, MN, USA, 2011:168-185.
  • 8Aguilera M K, Golab W, Shah M A. A practical scalable distributed B-tree//Proceedings of the VLDB. Auckland, New Zealand, 2008: 598-609.
  • 9Zhang Xiang-Yu, Ai Jing, Wang Zhong-Yuan, Lu Jia-Heng et al. An efficient multi-dimensional index for cloud data management//Proceedings of the CloudDB. Hong Kong, China, 2009:17-24.
  • 10InfiniteGraph, the Distributed Graph Database. http:// www. infinitegraph, com/, 2011 -7 -29.

共引文献98

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部