Adaptive partitioning and scheduling method of convolutional neural network inference model on heterogeneous platforms (cited by: 2)
Abstract: To address the low hardware resource utilization and high latency of Convolutional Neural Network (CNN) inference on heterogeneous platforms, an adaptive partitioning and scheduling method for CNN inference models is proposed. First, the key operators of the CNN are extracted by traversing the computational graph to partition the model adaptively, which makes the scheduling strategy more flexible. Then, based on performance measurements and a critical path-greedy search algorithm, the optimal running load is selected for each sub-model according to its running characteristics on a CPU-GPU heterogeneous platform, which improves sub-model inference speed. Finally, the cross-device scheduling mechanism in Tensor Virtual Machine (TVM) is used to configure the dependencies and running loads of the sub-models, achieving adaptive scheduling of model inference and reducing inter-device communication latency. Experimental results show that, compared with the inference speed of the TVM operator optimization method on GPU and on CPU, the proposed method improves inference speed by 5.88% to 19.05% and by 45.45% to 311.46%, respectively, with no loss of model inference accuracy.
Authors: SHANG Shaofa (尚绍法); JIANG Lin (蒋林); LI Yuancheng (李远成); ZHU Yun (朱筠) (College of Computer Science and Technology, Xi’an University of Science and Technology, Xi’an Shaanxi 710600, China; School of Electronic Engineering, Xi’an University of Posts and Telecommunications, Xi’an Shaanxi 710121, China)
Source: Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2023, No. 9, pp. 2828-2835 (8 pages)
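Funding: National Natural Science Foundation of China (61834005); Natural Science Foundation of Shaanxi Province (2020JM-525); Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2020AAA0104603); Yulin Science and Technology Plan Project (CXY-2020-026).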
Keywords: Tensor Virtual Machine (TVM); Convolutional Neural Network (CNN); model partitioning; task scheduling; characteristic analysis
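
The abstract describes choosing a device for each sub-model from measured performance, while accounting for dependencies and inter-device communication. The following is a minimal sketch of that general idea, not the authors' critical path-greedy search algorithm, whose details the abstract does not give. It assumes sub-models are listed in topological order, that per-device latencies have already been measured, and that every cross-device dependency costs a fixed transfer time. The names `SubModel`, `greedy_assign`, and `transfer_ms`, and all latency values, are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class SubModel:
    name: str
    latency_ms: Dict[str, float]                   # measured latency per device (made-up values below)
    deps: List[str] = field(default_factory=list)  # names of predecessor sub-models


def greedy_assign(submodels: List[SubModel], transfer_ms: float) -> Tuple[Dict[str, str], float]:
    """Greedy earliest-finish-time placement (illustrative only).

    Sub-models must be given in topological order. Each one is placed on the
    device where it would finish first, given when its inputs arrive
    (plus transfer_ms for inputs produced on another device) and when that
    device becomes free.
    """
    finish: Dict[str, float] = {}       # sub-model name -> finish time
    placement: Dict[str, str] = {}      # sub-model name -> chosen device
    device_free: Dict[str, float] = {}  # device -> time it becomes idle

    for sm in submodels:
        best_end, best_dev = None, None
        for dev, lat in sm.latency_ms.items():
            # Time at which all inputs are available on this device.
            ready = max(
                (finish[d] + (0.0 if placement[d] == dev else transfer_ms) for d in sm.deps),
                default=0.0,
            )
            end = max(ready, device_free.get(dev, 0.0)) + lat
            if best_end is None or end < best_end:
                best_end, best_dev = end, dev
        finish[sm.name] = best_end
        placement[sm.name] = best_dev
        device_free[best_dev] = best_end

    return placement, max(finish.values())


# Toy graph: one shared block, two parallel branches, one head.
submodels = [
    SubModel("conv_block", {"cpu": 8.0, "gpu": 2.0}),
    SubModel("branch_a", {"cpu": 3.0, "gpu": 1.0}, ["conv_block"]),
    SubModel("branch_b", {"cpu": 1.5, "gpu": 1.0}, ["conv_block"]),
    SubModel("head", {"cpu": 1.0, "gpu": 0.5}, ["branch_a", "branch_b"]),
]
placement, makespan = greedy_assign(submodels, transfer_ms=0.25)
print(placement, makespan)
```

In this toy input, `branch_b` is placed on the CPU even though the GPU runs it faster in isolation, because the GPU is still busy with `branch_a`. That is the kind of trade-off between per-device speed, device availability, and transfer cost that a cross-device scheduler has to weigh.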