Abstract
To address the high computational latency and power consumption incurred when deploying large-scale neural networks on in-memory computing hardware, this paper proposes a deep reinforcement learning-based optimization algorithm for neural network deployment. First, the deployment task is modeled as a Markov decision process whose objective is to optimize the latency and power consumption of the neural network while mapping it onto the on-chip computing cores. Second, because the optimization suffers from an excessively large solution space and insufficient exploration capability, a deployment optimization algorithm based on deep reinforcement learning is introduced to obtain a near-optimal neural network deployment strategy. Finally, to counter the insufficient exploration capability of reinforcement learning, a reward strategy based on intrinsic motivation is proposed; it encourages exploration of unknown regions of the solution space, improves deployment quality, and mitigates the tendency to become trapped in local optima. Experimental results show that the proposed algorithm further optimizes power consumption and latency compared with existing reinforcement learning algorithms.
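The abstract does not spell out the reward formulation, so the following is only a minimal sketch of how an intrinsic-motivation term can be added to a deployment reward. It assumes a count-based novelty bonus (one common form of intrinsic reward, not necessarily the one the authors use) and a hypothetical hop-weighted traffic cost on a 2D mesh as a stand-in for latency and power. The names `comm_cost`, `CountBonus`, `traffic`, `mesh_width`, and `beta` are all illustrative assumptions.

```python
import math
from collections import Counter


def comm_cost(placement, traffic, mesh_width):
    """Hop-weighted traffic of a placement on a 2D mesh.

    Hypothetical proxy for latency/power, not the paper's cost model.
    placement[i] is the core index assigned to layer i;
    traffic maps (src_layer, dst_layer) -> data volume.
    """
    cost = 0
    for (src, dst), volume in traffic.items():
        row1, col1 = divmod(placement[src], mesh_width)
        row2, col2 = divmod(placement[dst], mesh_width)
        # Manhattan distance between the two cores, weighted by volume.
        cost += volume * (abs(row1 - row2) + abs(col1 - col2))
    return cost


class CountBonus:
    """Assumed count-based intrinsic reward: beta / sqrt(visit count)."""

    def __init__(self, beta=1.0):
        self.beta = beta
        self.visits = Counter()

    def reward(self, placement, traffic, mesh_width):
        key = tuple(placement)
        self.visits[key] += 1
        extrinsic = -comm_cost(placement, traffic, mesh_width)
        # The bonus shrinks each time the same placement is revisited,
        # which nudges the agent toward placements it has not tried yet.
        intrinsic = self.beta / math.sqrt(self.visits[key])
        return extrinsic + intrinsic


# Three layers on a 2x2 mesh; layer 0 feeds layer 1, layer 1 feeds layer 2.
traffic = {(0, 1): 2, (1, 2): 1}
bonus = CountBonus(beta=1.0)
first_visit = bonus.reward([0, 1, 3], traffic, mesh_width=2)
second_visit = bonus.reward([0, 1, 3], traffic, mesh_width=2)
```

In a policy-gradient setup such as proximal policy optimization, a combined reward of this kind would simply replace the extrinsic reward in the return computation; the weight `beta` controls how strongly novelty is favored over the cost term.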
Authors
Hu Yidi (胡益笛)
Xia Yinshui (夏银水)
Faculty of Electrical Engineering & Computer Science, Ningbo University, Ningbo, Zhejiang 315211, China
Source
Application Research of Computers (《计算机应用研究》)
CSCD
Peking University Core Journal (北大核心)
2023, Issue 9, pp. 2616-2620 (5 pages)
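Funding
National Natural Science Foundation of China (62131010, U22A2013)
Zhejiang Provincial Innovation Group Project (LDT23F4021F04)
Ningbo High-tech Zone Major Technological Innovation Project (2022BCX050001)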
Keywords
processing in memory
deep reinforcement learning
neural network deployment
proximal policy optimization
intrinsic reward