Abstract
The current mainstream artificial intelligence (AI) computing platforms, typified by x86+GPU, are constrained by power consumption, size, bandwidth, environmental adaptability, and other factors, and are therefore ill-suited to things-side and edge intelligent computing scenarios. This paper proposes and studies an embedded AI computing system based on ARM (Advanced RISC Machine) + DLP (Deep Learning Processor) + SRIO (Serial RapidIO). The design approach and technical advantages of the system are analyzed from three aspects: AI computing performance, energy efficiency, and IO bandwidth, and the functions and performance of the system are verified experimentally. The results show that the system reaches a peak AI computing performance of 114.9 TOPS, an energy efficiency of 1.03 TFLOPS/W, and an IO bandwidth of 20 Gbps. Among AI computing systems, its energy efficiency exceeds that of other known comparable boards or systems in China, and its adaptability to embedded environments exceeds that of traditional desktops and servers, so it can serve as a general-purpose hardware acceleration platform for AI computing tasks in things-side and edge environments.
Authors
赵二虎
吴济文
查晶晶
郭振
徐勇军
ZHAO Er-hu; WU Ji-wen; ZHA Jing-jing; GUO Zhen; XU Yong-jun (Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China)
Source
《电子学报》
EI
CAS
CSCD
Peking University Core Journals
2021, No. 3, pp. 443-453 (11 pages)
Acta Electronica Sinica
Funding
"13th Five-Year Plan" Domain Fund (No. 61403120111).