Abstract
As a new acceleration platform for artificial intelligence applications, the FPGA (field-programmable gate array) can replace the GPU for accelerating the inference stage. This paper proposes a new acceleration scheme for artificial intelligence applications: it optimizes convolutional neural network (CNN) models using methods such as fixed-point quantization and matrix compression, and it designs and develops a driver software framework to adapt the scheme to domestic platforms. The technique is used on a Phytium 1500A (飞腾1500A) domestic server to accelerate CNN-based face recognition and object detection applications. Computing performance improves by more than 30 times over the computing capability of current domestic servers, giving an independently controllable way to accelerate artificial intelligence applications.
Authors
DING Li-de (丁立德)
HU Huai-xiang (胡怀湘)
North China Institute of Computing Technology, Beijing 100083, China
Source
Information Technology (《信息技术》)
2019, No. 12, pp. 110-115 (6 pages)