Abstract
To mitigate the impact of the memory wall and the traditional von Neumann bottleneck (high energy, high latency, and low bandwidth) on high-performance, low-power system design, a hardware acceleration architecture based on in-memory computing is proposed. Regularly arranged memory arrays are transformed into reconfigurable computing resources that carry out specific operations while retaining their original storage function, accelerating both data-intensive and compute-intensive tasks and thus serving the dual purpose of storage and computing. A background data transfer mechanism hides the communication latency between the processor and the off-chip in-memory computing logic. The bank-based organization of the memory is exploited by connecting the banks to lightweight processing elements, so that different tasks are computed in parallel at high bandwidth and system performance improves. Experimental results show that, relative to a traditional accelerator architecture, the proposed architecture improves performance by at least 2x at a hardware (chip area) overhead of less than 2%.
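The latency-hiding idea behind background data transfer can be illustrated with a small timing model. This is only a sketch under stated assumptions, not the paper's implementation: the function names, the block count, and the per-block latencies are all hypothetical, and the model assumes that the transfer of the next block can fully overlap with computation on the current one.

```python
def serial_time(num_blocks: int, t_transfer: int, t_compute: int) -> int:
    """Total time when each block is transferred, then computed, with no overlap."""
    return num_blocks * (t_transfer + t_compute)


def overlapped_time(num_blocks: int, t_transfer: int, t_compute: int) -> int:
    """Total time when block i+1 is transferred in the background while
    block i is being computed (a simple two-stage pipeline).

    The first transfer and the last compute cannot be hidden; in between,
    each step is limited by the slower of the two stages.
    """
    if num_blocks == 0:
        return 0
    return t_transfer + (num_blocks - 1) * max(t_transfer, t_compute) + t_compute


# Hypothetical numbers (arbitrary time units), chosen only for illustration.
blocks, transfer, compute = 4, 3, 5
print("serial:    ", serial_time(blocks, transfer, compute))
print("overlapped:", overlapped_time(blocks, transfer, compute))
```

In this model the transfer cost disappears from the steady state whenever computation on a block takes at least as long as fetching the next one, which is the sense in which a background transfer hides communication latency.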
Source
Computer Engineering and Design (《计算机工程与设计》)
Peking University Core Journal (北大核心)
2016, No. 4, pp. 1071-1075 (5 pages)
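Funding
National Natural Science Foundation of China (61103008)
Science and Technology Commission of Shanghai Municipality Special Fund (12511503700)
Samsung-Fudan University Cooperation Fund (SLSI-201403DD013)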
Keywords
in-memory computing
processor
background data transfer
accelerator
parallel computing