Funding: Project (KF2029) supported by the State Key Laboratory of Automotive Safety and Energy (Tsinghua University), China; Project (102253) partially supported by Innovate UK.
Abstract: This paper studies a supervisory control system for a hybrid off-highway electric vehicle under the charge-sustaining (CS) condition. A new predictive double Q-learning with backup models (PDQL) scheme is proposed to optimize engine fuel consumption in real-world driving and to improve energy efficiency with a faster and more robust learning process. Unlike existing "model-free" methods, which rely solely on on-policy or off-policy updates of the knowledge base (Q-table), the PDQL is developed to merge on-policy and off-policy learning by introducing a backup model (Q-table). Experimental evaluations are conducted on software-in-the-loop (SiL) and hardware-in-the-loop (HiL) test platforms built around real-time models of the studied vehicle. Compared with the standard double Q-learning (SDQL), the PDQL needs only half the learning iterations to reach an energy efficiency better than that achieved by the SDQL at the end of its learning process. In the SiL tests over 35 rounds of learning, the results show that the PDQL improves vehicle energy efficiency by 1.75% over the SDQL. When implemented in the HiL under four predefined real-world conditions, the PDQL robustly saves more than 5.03% energy compared with the SDQL scheme.
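To make the abstract's description more concrete, the sketch below shows the standard tabular double Q-learning update (the SDQL baseline mentioned above), extended with a hypothetical "backup" Q-table placeholder. The backup-table blending step, the class name DoubleQAgent, and all parameter values are illustrative assumptions only; they are not the paper's PDQL update rule or its predictive mechanism.

import numpy as np

class DoubleQAgent:
    """Minimal tabular double Q-learning sketch (baseline SDQL-style update).

    The Q_backup table and its blending rule are hypothetical placeholders
    indicating where a backup model (Q-table) could sit; the actual PDQL
    scheme is defined in the paper, not here.
    """

    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.95,
                 epsilon=0.1, backup_rate=0.05):
        self.Q_a = np.zeros((n_states, n_actions))      # first online Q-table
        self.Q_b = np.zeros((n_states, n_actions))      # second online Q-table
        self.Q_backup = np.zeros((n_states, n_actions)) # hypothetical backup model
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.backup_rate = backup_rate
        self.n_actions = n_actions

    def act(self, state):
        # Epsilon-greedy action selection on the sum of the two online tables.
        if np.random.rand() < self.epsilon:
            return np.random.randint(self.n_actions)
        return int(np.argmax(self.Q_a[state] + self.Q_b[state]))

    def update(self, s, a, r, s_next):
        # Standard double Q-learning: randomly pick one table to update and
        # evaluate its greedy action with the other table (reduces
        # maximisation bias).
        if np.random.rand() < 0.5:
            a_star = int(np.argmax(self.Q_a[s_next]))
            target = r + self.gamma * self.Q_b[s_next, a_star]
            self.Q_a[s, a] += self.alpha * (target - self.Q_a[s, a])
        else:
            a_star = int(np.argmax(self.Q_b[s_next]))
            target = r + self.gamma * self.Q_a[s_next, a_star]
            self.Q_b[s, a] += self.alpha * (target - self.Q_b[s, a])

        # Hypothetical backup-model step (assumption): slowly track the mean
        # of the online tables so a separate backup Q-table is available for
        # later reuse, e.g. warm-starting or combining learning modes.
        avg = 0.5 * (self.Q_a[s, a] + self.Q_b[s, a])
        self.Q_backup[s, a] += self.backup_rate * (avg - self.Q_backup[s, a])

The sketch only reproduces the well-known double Q-learning core; how the PDQL actually couples the backup model with on-policy and off-policy updates for the hybrid vehicle's energy management is described in the paper itself.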