Abstract
Energy management strategies based on reinforcement learning adapt to both the environment and the driver, but they suffer from low sample efficiency, and their reliability cannot be guaranteed during development. This paper integrates dynamic programming into a model-based reinforcement learning framework: driving-condition data and powertrain data are collected in real time to update a prediction model, dynamic programming solves for the optimal policy, and a decision tree algorithm mines the optimal control rules from that solution to iterate the energy management strategy. On the one hand, re-solving the complete driving cycle with dynamic programming greatly improves sample efficiency and offers good reliability and interpretability. On the other hand, collecting and updating data makes the strategy adaptive to driving conditions and system aging, so it can handle scenarios in which traditional energy management strategies fail, such as extreme operating environments and aging powertrain components. Experimental results show that, under unknown driving conditions, the strategy achieves more than 92% of the performance of the global optimum, and that it can be improved effectively by learning from similar driving conditions, such as daily commuting. When powertrain parameters change, policy iteration also adjusts the energy management strategy effectively.
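The loop described in the abstract (solve a recorded driving cycle with dynamic programming, then mine a control rule from the optimal solution) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the integer battery model, the quadratic `fuel_cost`, the helper names `solve_dp`, `rollout` and `fit_stump`, and the use of a depth-1 decision tree (a stump on power demand) as a stand-in for the decision tree algorithm.

```python
def fuel_cost(engine_power):
    """Hypothetical convex fuel cost of the engine (illustrative assumption)."""
    return engine_power ** 2


def solve_dp(demand, soc_levels):
    """Backward dynamic programming over a known power-demand cycle.

    value[t][s]  = minimal fuel cost from step t onward with s battery units left
    policy[t][s] = battery units to spend at step t in that state
    """
    steps = len(demand)
    value = [[0.0] * (soc_levels + 1) for _ in range(steps + 1)]
    policy = [[0] * (soc_levels + 1) for _ in range(steps)]
    for t in range(steps - 1, -1, -1):
        for s in range(soc_levels + 1):
            best_cost, best_u = float("inf"), 0
            # The battery can cover at most the demand and at most what is left.
            for u in range(min(s, demand[t]) + 1):
                cost = fuel_cost(demand[t] - u) + value[t + 1][s - u]
                if cost < best_cost:
                    best_cost, best_u = cost, u
            value[t][s] = best_cost
            policy[t][s] = best_u
    return value, policy


def rollout(demand, soc0, policy):
    """Follow the optimal policy and record (demand, soc, battery_use) samples."""
    samples, soc = [], soc0
    for t, d in enumerate(demand):
        u = policy[t][soc]
        samples.append((d, soc, u))
        soc -= u
    return samples


def fit_stump(samples):
    """Fit a depth-1 tree 'use the battery if demand > threshold'.

    Returns the threshold with the fewest misclassified samples.
    """
    best_thr, best_err = None, float("inf")
    for thr in sorted({d for d, _, _ in samples}):
        err = sum((d > thr) != (u > 0) for d, _, u in samples)
        if err < best_err:
            best_thr, best_err = thr, err
    return best_thr


# Toy cycle: four time steps of integer power demand, four battery units.
cycle = [1, 3, 2, 4]
value, policy = solve_dp(cycle, 4)
samples = rollout(cycle, 4, policy)
print("optimal fuel cost:", value[0][4])
print("extracted rule: use battery if demand >", fit_stump(samples))
```

In this sketch the extracted rule plays the role of the online strategy, and re-running `solve_dp` on newly recorded cycles would correspond to one iteration of the strategy. A realistic implementation would need a continuous state of charge, a calibrated powertrain model, and a full decision tree over several features.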
Authors
Luo Laijun (罗来军)
Sui Qiaomei (隋巧梅)
Guo Nanhong (郭楠鸿)
Affiliations: DIAS Automotive Electronic Systems Co., Ltd., Shanghai 201206; Institute of Power Plant and Automation, Shanghai Jiao Tong University, Shanghai 200240
Source
Drive System Technique (《传动技术》)
2022, No. 3, pp. 3-11 (9 pages)
Funding
SAIC Foundation project (No. 1722).
Keywords
dynamic programming
adaptive
energy management strategy
plug-in hybrid electric vehicle
reinforcement learning