Funding: fund of the Hong Kong Polytechnic University (P0038174).
Abstract: By leveraging the diversity of data samples, the early-exit network has recently emerged as a prominent neural network architecture for accelerating deep learning inference. However, the intermediate classifiers of the early exits introduce additional computation overhead, which is unfavorable for resource-constrained edge artificial intelligence (AI). In this paper, we propose an early exit prediction mechanism to reduce the on-device computation overhead in a device-edge co-inference system supported by early-exit networks. Specifically, we design a low-complexity module, namely the exit predictor, to guide some distinctly “hard” samples to bypass the computation of the early exits. In addition, considering the varying communication bandwidth, we extend the early exit prediction mechanism to latency-aware edge inference, which adapts the prediction thresholds of the exit predictor and the confidence thresholds of the early-exit network via a few simple regression models. Extensive experimental results demonstrate the effectiveness of the exit predictor in achieving a better tradeoff between accuracy and on-device computation overhead for early-exit networks. Moreover, compared with the baseline methods, the proposed method for latency-aware edge inference attains higher inference accuracy under different bandwidth conditions.
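The control flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `infer_with_exit_prediction`, the toy predictor and classifier, and all threshold values are hypothetical, chosen only to show how a cheap exit predictor might let "hard" samples skip an intermediate classifier. The thresholds are passed in as plain inputs here; the regression models that the paper uses to adapt them to bandwidth are not modelled.

```python
from typing import Callable, List, Sequence, Tuple

Features = List[float]


def infer_with_exit_prediction(
    x: Features,
    blocks: Sequence[Callable[[Features], Features]],
    exit_predictors: Sequence[Callable[[Features], float]],
    exit_classifiers: Sequence[Callable[[Features], List[float]]],
    final_classifier: Callable[[Features], List[float]],
    prediction_thresholds: Sequence[float],
    confidence_thresholds: Sequence[float],
) -> Tuple[int, str, int]:
    """Return (predicted label, where the sample exited, early exits evaluated)."""
    evaluated = 0
    for i, block in enumerate(blocks):
        x = block(x)
        # Cheap check first: a low score marks the sample as "hard" for exit i.
        if exit_predictors[i](x) < prediction_thresholds[i]:
            continue  # bypass the intermediate classifier entirely
        probs = exit_classifiers[i](x)
        evaluated += 1
        # Standard early-exit rule: stop once the classifier is confident enough.
        if max(probs) >= confidence_thresholds[i]:
            return probs.index(max(probs)), f"exit-{i}", evaluated
    # No early exit was taken, so fall through to the final classifier.
    probs = final_classifier(x)
    return probs.index(max(probs)), "final", evaluated


# Toy stand-ins (assumptions for illustration, not trained models).
def identity(x: Features) -> Features:
    return x


def toy_predictor(x: Features) -> float:
    return abs(x[0] - x[1])  # a large margin suggests the sample can exit early


def toy_classifier(x: Features) -> List[float]:
    total = x[0] + x[1]
    return [x[0] / total, x[1] / total]


def run(sample: Features) -> Tuple[int, str, int]:
    return infer_with_exit_prediction(
        sample, [identity], [toy_predictor], [toy_classifier],
        toy_classifier, prediction_thresholds=[0.5], confidence_thresholds=[0.8],
    )


easy = run([0.9, 0.1])    # predictor score is high, so the early exit is evaluated
hard = run([0.55, 0.45])  # predictor score is low, so the early exit is skipped
```

In this sketch the saving comes from the third element of the result: for the "hard" sample no intermediate classifier is evaluated, so its cost is avoided, at the price of running the (assumed much cheaper) predictor on every sample.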