Abstract
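With the development of deep neural networks, neural network development on FPGAs has attracted wide attention. Using the OpenCL SDK provided by Intel FPGA, this paper designs and implements a complete forward model of a fully connected neural network on an FPGA board. To address the memory bottleneck in the baseline system, it applies optimizations including group partitioning, data reuse, activation function optimization, single instruction, multiple data (SIMD), and half-precision floating point. These balance resource usage in the system, enlarge the circuit scale, and improve system performance. Compared with the baseline version, the optimized version achieves a 2.19x speedup. After optimization, the system clock frequency reaches 380 MHz, RAM utilization reaches 94%, and DSP utilization reaches 42%.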
With the wide use of deep learning in different areas, implementing neural networks on FPGAs has drawn experts' attention. This paper implements a fully connected neural network with OpenCL on an FPGA and optimizes the system with methods such as dividing computation into groups, reusing data, optimizing the activation function, applying single instruction, multiple data (SIMD), and using the half-precision floating-point format. These strategies enlarge the scale and improve the performance of the system. Finally, we achieve a 2.19x speedup at a system frequency of 380 MHz, with 92% usage of RAMs and 42% usage of DSP blocks.
Authors
周鑫
安虹
迟孟贤
金旭
韩文廷
ZHOU Xin; AN Hong; CHI Meng-xian; JIN Xu; HAN Wen-ting (School of Computer Science and Technology, University of Science and Technology of China, Hefei 230021, China)
Source
《小型微型计算机系统》
CSCD
Peking University Core Journal
2019, Issue 2, pp. 348-352 (5 pages)
Journal of Chinese Computer Systems
Funding
Supported by the National Key Research and Development Program of China (2016YFB1000403)