Abstract
Intelligent computing units in aerospace applications face harsh operating environments, demanding inference-speed requirements, and a complex model deployment process. To address these problems, we designed and implemented an intelligent acceleration module based on a domestic field-programmable gate array (FPGA). The module's interface complies with the VPX high-speed serial bus standard, it has a robust mechanical structure and good environmental adaptability, and it supports inference acceleration for algorithms such as deep-learning object detection. With core components such as the FPGA chip, DDR memory, and power conversion modules selected to meet domestic-sourcing requirements, the hardware circuit was designed and a physical module was produced. Taking an object detection algorithm as an example, the weight and model files were deployed to the module for inference using a purpose-built automatic compilation tool. Experiments show that the module offers good environmental adaptability, convenient deployment, and good acceleration performance, with a speedup of about 4.47 times relative to a domestic central processing unit (CPU).
Authors
叶亚峰
张宁
寇金桥
王昕
YE Ya-feng; ZHANG Ning; KOU Jin-qiao; WANG Xin (Institute 706, Second Academy of China Aerospace Science and Industry Corporation, Beijing 100854, China)
Source
Computer Technology and Development (《计算机技术与发展》)
2024, No. 10, pp. 8-15 (8 pages)
Keywords
intelligent computing unit
field programmable gate array
object detection
intelligent acceleration module
deep learning processing unit
automatic compilation tool