Abstract
To address the low efficiency of relation network inference on data-center graphics processing unit (GPU) platforms, this paper proposes a relation network optimization method based on software-hardware co-acceleration. The method processes relation network inference through heterogeneous collaboration between a support-set feature pool extracted on the GPU and inference performed on a field programmable gate array (FPGA), achieving high computational efficiency while keeping inference accuracy consistent with the GPU platform. High-level synthesis (HLS) is used to optimize the floating-point convolutional neural network computation, improving the energy efficiency of relation network processing. A heterogeneous multi-core organization with multiple computing units is used to meet FPGA timing closure while increasing on-chip throughput. A relation network inference unit is implemented on an FPGA platform: the accelerator built for the Omniglot dataset consumes 15.867 W with a speedup of 1.4-17.2 over the GPU, and the accelerator built for the miniImageNet dataset consumes 12.359 W with a speedup of 1.5-3.4 over the GPU. Compared with similar FPGA accelerators for floating-point convolutional neural networks, the proposed method achieves the best computational efficiency. The experimental results show that the method effectively exploits software-hardware collaborative computing and FPGA reconfigurable computing, reduces the coupling of software-hardware co-development, and improves the computational efficiency of relation network inference while maintaining its accuracy.
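As a rough illustration of the HLS-optimized floating-point convolution mentioned in the abstract, the sketch below shows one possible compute-unit kernel in HLS-style C++. The kernel size, channel count, tile size, and pragma choices are illustrative assumptions and are not taken from the paper; the sketch only conveys the general technique of pipelining and unrolling a single-precision convolution loop nest for FPGA synthesis.

```cpp
// Hypothetical sketch of a single floating-point convolution compute unit,
// written in HLS-style C++ (the #pragma HLS directives are ignored by ordinary
// C++ compilers). All sizes below are assumptions for illustration only.

constexpr int K    = 3;   // convolution kernel size (assumed)
constexpr int IC   = 64;  // input channels (assumed)
constexpr int TILE = 16;  // output tile handled by one compute unit (assumed)

// One compute unit: produces a TILE x TILE output tile from a padded input tile.
// Arithmetic stays in single-precision float, so results match a GPU baseline.
void conv_unit(const float in[IC][TILE + K - 1][TILE + K - 1],
               const float w[IC][K][K],
               float out[TILE][TILE]) {
#pragma HLS ARRAY_PARTITION variable=w dim=2 complete
#pragma HLS ARRAY_PARTITION variable=w dim=3 complete
    for (int oy = 0; oy < TILE; ++oy) {
        for (int ox = 0; ox < TILE; ++ox) {
            float acc = 0.0f;
            // Pipeline the channel loop; the 3x3 kernel loops are fully
            // unrolled, so each pipeline iteration issues K*K
            // multiply-accumulate operations.
            for (int c = 0; c < IC; ++c) {
#pragma HLS PIPELINE
                for (int ky = 0; ky < K; ++ky) {
#pragma HLS UNROLL
                    for (int kx = 0; kx < K; ++kx) {
#pragma HLS UNROLL
                        acc += in[c][oy + ky][ox + kx] * w[c][ky][kx];
                    }
                }
            }
            out[oy][ox] = acc;
        }
    }
}
```

Several independent instances of such a unit (or differently sized variants) could then be driven in parallel over different tiles or layers, which is one way to realize the multi-compute-unit, heterogeneous multi-core organization the abstract describes.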
Authors
ZHANG Zhichao, WANG Jian, ZHANG Longbing, XIAO Junhua (State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190; Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190; University of Chinese Academy of Sciences, Beijing 100049; The 15th Research Institute of China Electronics Technology Group Corporation, Beijing 100083)
Source
Chinese High Technology Letters (《高技术通讯》), CAS
2022, No. 4, pp. 327-336 (10 pages)
Funding
National Natural Science Foundation of China (61432016)
National Key Research and Development Program of China (2018YFC0832306, 2018YFC0831203, 2018YFC0831206).
Keywords
relation network
software and hardware co-acceleration
convolutional neural network
heterogeneous multi-core