摘要
图划分是大图数据并行计算的基础,目前主要采用分布式算法实现大图划分.非易失存储器(Non-Volatile Memory,NVM)速度接近动态随机存储器(Dynamic Random Access Memory,DRAM),且具有低功耗、高密度、低时延等优点,本文针对分布式图划分算法难以分析和调试等问题,设计了基于混合内存的单机图划分算法框架.作者提出了基于邻边结构的图划分结果动态缓存管理策略(AeFdy),以提高缓存区邻居节点的搜索效率.在17种真实应用数据上的实验结果表明,采用新方法的平均图划分速度是基于邻点结构算法的4.9倍.本文还针对NVM寿命有限的问题,设计了基于内存页读写特征的迁移算法,实现了NVM写操作受限条件下的迁移优化方案.相对于Linux Swap、M-CLOCK、Dr.Swap混合内存管理策略,使用AeFdy策略的性能分别提升了128.5%、87.4%与50.4%.仿真实验结果表明,本文设计的混合内存管理方法实现了NVM+DRAM高效协同.
The tremendous scale of modern graph datasets has rapidly increased the demand for efficient algorithms for graph analysis.A standard approach distributes the graph over a cluster of computer nodes,but the performing computations on a distributed graph is expensive if the large amount of data has to be moved.Graph partitioning is the precondition of distributed graph computing framework,which is a key problem in improving the performance of distributed graph computing.Streaming graph partitioning is more efficient compared with offline partitioning,it has been developed continuously in the application of graph partitioning in recent years.Because of the limitation of memory capacity,the single commodity type computer is difficult to partition and optimize the massive graphs.Existing methods mainly use distributed cluster to process these large graph partitioning,while distributed computational resources have become more accessible,developing distributed graph partitioning algorithms still remains challenging,especially to non-experts.NVM storage has the advantages of low power consumption,high density,low latency and byte-addressable,which is the construction of high performance storage system and an important means to improve the performance big data system.But NVM storage also has some disadvantages,such as writing power consumption is higher than reading,write latency is higher than read,and the write counts of NVM is limited.The asymmetry problem about read and write in NVM should be paid more attention when using the hybrid NVM and DRAM memory.In this work,we explore to partition the large graphs under single compute node with hybrid NVM and DRAM memory.We propose a management strategy based on adjacent edge structure for dynamic cached data(AeFdy).This strategy converts the cached data structure from the adjacent vertex structure in the original streaming algorithms to the adjacent edge structure.The experimental results on 7 real-world graphs show that the average partitioning time of the new method is 4.9 times faster than that of original method.At the same time,the strategy evaluates the data pages in NVM and DRAM media by different models according to the characteristics of the streaming algorithm based on adjacent edge structure,places data pages in different memory media to reduce system migration operation times and improve the system performance.To demonstrate,the AeFdy strategy is simulated in the Linux kernel.Compared with the existing hybrid memory management strategy,such as Linux Swap,M-CLOCK and Dr.Swap,AeFdy improves the system performance by 128.5%,87.4%and 50.4%respectively.
作者
李琪
钟将
李雪
LI Qi;ZHONG Jiang;LI Xue(College of Computer Science,Chongqing University,Chongqing 400030;School of Information Technology and Electrical Engineering,University of Queensland,Brisbane,4072 Australia)
出处
《计算机学报》
EI
CSCD
北大核心
2019年第11期2481-2498,共18页
Chinese Journal of Computers
基金
国家重点研发计划(2017YFB1402400)
重庆市社会事业与民生保障科技创新专项(cstc2017shmsA0641)
国家“八六三”高技术研究发展计划项目基金(2015AA015308)
重庆市重点产业共性关键技术创新专项(cstc2017zdcy-zdyxx0047)
重庆市技术创新与应用示范(产业类重点研发)项目(cstc2018jszx-cyzdX0086)
中央高校项目(2018CDYJSY-0055)资助~~
关键词
复杂网络
非易失存储器
流划分
混合内存
内存计算
平衡图划分
complex network
non-volatile memory
streaming partitioning
hybrid memory
memory computing
balanced graph partitioning