Abstract
Obtaining the inference latency of a convolutional neural network (CNN) on hardware via learnable prediction algorithms has attracted increasing attention. Existing latency predictors suffer from two major problems: the high sampling complexity of the CNN design space makes data collection expensive, and traditional algorithms fail to accurately model the effect of the hardware compiler's operator fusion on inference latency. To address these problems, this paper proposes a latency predictor based on a graph convolutional network (GCN), which regards the latency of a complete network as the accumulation of per-node latency compensations and uses graph convolution to model the latency effects of operator fusion. Furthermore, a differential training scheme is proposed to reduce the size of the sampling space and improve the generalization of the algorithm. Latency prediction experiments on models sampled from the MB-C continuous search space, run on the HISI3559 hardware platform, show that the proposed algorithm reduces the average relative error of latency estimation from 302% for the traditional algorithm to 5.3%. In addition, replacing the traditional latency predictor with the proposed one enables neural architecture search algorithms to find high-precision networks whose latency is closer to the target.
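The abstract only describes the predictor at a high level. As a rough, non-authoritative illustration of the core idea (whole-network latency as the sum of per-node latency compensations predicted by a GCN over the operator graph), the PyTorch sketch below uses made-up class names, feature dimensions, and a toy operator graph; it is an assumption for illustration, not the authors' implementation.

```python
# Minimal sketch (assumption, not the paper's code): a GCN over the operator
# graph predicts a per-node latency "compensation", and the whole-network
# latency is the sum of these node terms. Neighborhood aggregation lets the
# model see adjacent operators, which is how compiler fusion effects
# (e.g. Conv+BN+ReLU merged into one kernel) can be reflected in the nodes.
import torch
import torch.nn as nn


class GCNLatencyPredictor(nn.Module):
    def __init__(self, feat_dim: int, hidden_dim: int = 64):
        super().__init__()
        self.gc1 = nn.Linear(feat_dim, hidden_dim)   # graph-conv layer 1
        self.gc2 = nn.Linear(hidden_dim, hidden_dim) # graph-conv layer 2
        self.head = nn.Linear(hidden_dim, 1)         # per-node latency head

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # x:   (N, feat_dim)  per-operator features (type, kernel size, channels, ...)
        # adj: (N, N)         normalized adjacency of the operator graph
        h = torch.relu(adj @ self.gc1(x))
        h = torch.relu(adj @ self.gc2(h))
        node_latency = self.head(h).squeeze(-1)      # (N,) per-node compensation
        return node_latency.sum()                    # total predicted latency


# Toy usage: a 3-operator chain (Conv -> BN -> ReLU) with made-up features.
if __name__ == "__main__":
    x = torch.randn(3, 16)
    adj = torch.tensor([[1., 1., 0.],
                        [1., 1., 1.],
                        [0., 1., 1.]])
    adj = adj / adj.sum(dim=1, keepdim=True)         # simple row normalization
    model = GCNLatencyPredictor(feat_dim=16)
    print(model(x, adj))                             # predicted end-to-end latency
```

Because the per-node terms are produced after message passing over the operator graph, the predicted contribution of an operator can change depending on its neighbors, which is what allows fusion-induced latency changes to be captured rather than summing fixed per-operator lookup values.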
Authors
LI Zheyang (李哲暘), ZHANG Ruyi (张如意), TAN Wenming (谭文明), REN Ye (任烨), LEI Ming (雷鸣), WU Hao (吴昊)
(Hangzhou Hikvision Digital Technology Co., Ltd., Hangzhou 310051, China; Hangzhou Hikvision System Technology Co., Ltd., Hangzhou 310051, China)
Source
《北京航空航天大学学报》 (Journal of Beijing University of Aeronautics and Astronautics)
Indexed in: EI, CAS, CSCD, Peking University Core Journals (北大核心)
2022, Issue 12, pp. 2450-2459 (10 pages)
Funding
National Key R&D Program of China (2018YFC0807706).
Keywords
latency prediction
graph convolution network
deep learning
neural architecture search
model deployment