Design and implementation of a novel off-chip memory access path for graph computing

Original title: 面向图计算应用的处理器访存通路优化设计与实现
Abstract: Targeting the memory access characteristics of graph computing applications, this paper proposes and implements a high-concurrency memory access module, the High Concurrency and high Performance Fetcher (HCPF), which supports highly concurrent, out-of-order, and asynchronous off-chip memory requests. Through hardware-software co-design, HCPF can handle 192 in-flight memory access requests of 8 types simultaneously, with a user-defined access granularity, meeting the demand of graph applications for massive, low-latency, fine-grained data accesses. HCPF also extends a custom memory-semantic interconnect across computing nodes to support fine-grained direct access to remote memory, laying the technical groundwork for a future distributed graph computing framework. Building on these two contributions and a pipelined RISC-V processor core, the authors design and implement a RISC-V System-on-Chip (SoC) architecture that supports HCPF, build an FPGA-based prototype platform, and carry out a preliminary performance evaluation with in-house random access microbenchmarks. The results show that, compared with the original memory access path, HCPF improves the performance of array-based and random-address-based random memory accesses to at most 3.5x and 2.7x of the baseline, respectively. Directly accessing 4 bytes of data in remote memory takes only 1.63 μs.
Authors: ZHANG Xu (张旭); CHANG Yisong (常轶松); ZHANG Ke (张科); CHEN Mingyu (陈明宇)
Affiliations: Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100049, China; Peng Cheng Laboratory, Shenzhen 518000, China
Source: Journal of National University of Defense Technology (《国防科技大学学报》), 2020, No. 2, pp. 13-22 (10 pages). Indexed in: EI, CAS, CSCD, Peking University Core Journals.
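Funding: National Key Research and Development Program of China (2017YFB1001602); National Natural Science Foundation of China (61702485); Youth Innovation Promotion Association of the Chinese Academy of Sciences (2017143).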
Keywords: memory-level parallelism; memory access path; graph computing