5 articles found
Day-ahead scheduling based on reinforcement learning with hybrid action space
1
Authors: CAO Jingyu, DONG Lu, SUN Changyin. Journal of Systems Engineering and Electronics (SCIE, EI, CSCD), 2022, No. 3, pp. 693-705 (13 pages)
Driven by the improvement of the smart grid, the active distribution network (ADN) has attracted much attention due to its characteristic of active management. By making full use of electricity price signals for optimal scheduling, the total cost of the ADN can be reduced. However, the optimal day-ahead scheduling problem is challenging since the future electricity price is unknown. Moreover, in the ADN, some schedulable variables are continuous while others are discrete, which increases the difficulty of determining the optimal scheduling scheme. In this paper, the day-ahead scheduling problem of the ADN is formulated as a Markov decision process (MDP) with a continuous-discrete hybrid action space. Then, an algorithm based on multi-agent hybrid reinforcement learning (HRL) is proposed to obtain the optimal scheduling scheme. The proposed algorithm adopts the structure of centralized training and decentralized execution, and different methods are applied to determine the selection policy of continuous scheduling variables and discrete scheduling variables. The simulation experiment results demonstrate the effectiveness of the algorithm.
Keywords: day-ahead scheduling; active distribution network (ADN); reinforcement learning; hybrid action space
Hybrid Q-learning for data-based optimal control of non-linear switching system
2
Authors: LI Xiaofeng, DONG Lu, SUN Changyin. Journal of Systems Engineering and Electronics (SCIE, EI, CSCD), 2022, No. 5, pp. 1186-1194 (9 pages)
In this paper, the optimal control of a non-linear switching system is investigated without knowledge of the system dynamics. First, the Hamilton-Jacobi-Bellman (HJB) equation is derived with the hybrid action space taken into account. Then, a novel data-based hybrid Q-learning (HQL) algorithm is proposed to find the optimal solution in an iterative manner. In addition, a theoretical analysis is provided to establish the convergence and optimality of the proposed algorithm. Finally, the algorithm is implemented with the actor-critic (AC) structure, and two linear-in-parameter neural networks are used to approximate the functions. Simulation results validate the effectiveness of the data-driven method.
Keywords: switching system; hybrid action space; optimal control; reinforcement learning; hybrid Q-learning (HQL)
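A hybrid action of the kind this abstract describes pairs a discrete switching mode with a continuous control input. The sketch below only illustrates that idea with a greedy choice over (mode, control) pairs; it is not the authors' HQL algorithm. The function `select_hybrid_action`, the toy `actor` and `critic`, and the numeric gains are all hypothetical, and the critic is treated as a value to maximize (a cost-based formulation would take the minimum instead).

```python
import math
from typing import Callable, Tuple

# Hypothetical toy system: scalar state x, two switching modes with input gains B.
B = [1.0, -2.0]          # assumed input gain for each mode
K = [0.5, -0.4]          # assumed feedback gain of the per-mode actor


def actor(x: float, mode: int) -> float:
    """Toy per-mode actor: a linear state feedback u = -K[mode] * x."""
    return -K[mode] * x


def critic(x: float, mode: int, u: float) -> float:
    """Toy critic: negative one-step cost of the next state plus control effort."""
    x_next = x + B[mode] * u
    return -(x_next ** 2 + 0.1 * u ** 2)


def select_hybrid_action(
    x: float,
    n_modes: int,
    actor_fn: Callable[[float, int], float],
    critic_fn: Callable[[float, int, float], float],
) -> Tuple[int, float]:
    """Return the (mode, control) pair with the highest critic estimate."""
    if n_modes < 1:
        raise ValueError("n_modes must be at least 1")
    best_pair, best_q = (0, 0.0), -math.inf
    for mode in range(n_modes):
        u = actor_fn(x, mode)        # continuous part proposed for this mode
        q = critic_fn(x, mode, u)    # value of the full hybrid action
        if q > best_q:
            best_pair, best_q = (mode, u), q
    return best_pair


mode, u = select_hybrid_action(1.0, len(B), actor, critic)
```

Enumerating the modes is practical only when the discrete set is small, which is the usual case for a switching system with a handful of subsystems.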
Sim-to-Real: A Performance Comparison of PPO, TD3, and SAC Reinforcement Learning Algorithms for Quadruped Walking Gait Generation
3
Authors: James W. Mock, Suresh S. Muknahallipatna. Journal of Intelligent Learning Systems and Applications, 2024, No. 2, pp. 23-43 (21 pages)
The performance of state-of-the-art deep reinforcement learning algorithms such as Proximal Policy Optimization (PPO), Twin Delayed Deep Deterministic Policy Gradient (TD3), and Soft Actor-Critic (SAC) for generating a quadruped walking gait in a virtual environment was presented in previous research work titled “A Comparison of PPO, TD3, and SAC Reinforcement Algorithms for Quadruped Walking Gait Generation”. We demonstrated that the SAC algorithm had the best performance in generating the walking gait for a quadruped under certain sensor configurations in the virtual environment. In this work, we present a performance analysis of the same algorithms for quadruped walking gait generation in a physical environment. The performance is determined in the physical environment by transfer learning augmented by real-time reinforcement learning for gait generation on a physical quadruped. The performance is analyzed on a quadruped equipped with a range of sensors: position tracking using a stereo camera, contact sensing of each of the robot legs through force resistive sensors, and proprioceptive information of the robot body and legs using nine inertial measurement units. The performance comparison is presented using the metrics associated with the walking gait: average forward velocity (m/s), average forward velocity variance, average lateral velocity (m/s), average lateral velocity variance, and quaternion root mean square deviation. The strengths and weaknesses of each algorithm for the given task on the physical quadruped are discussed.
Keywords: Reinforcement Learning; Reality Gap; Position Tracking; Action Spaces; Domain Randomization
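The gait metrics named in this abstract can be computed from logged positions and orientations. The abstract does not give the exact definitions, so the sketch below makes its own assumptions: velocity is a finite difference of tracked positions, variance is the population variance, and the quaternion root mean square deviation is taken component-wise against a reference orientation after resolving the sign ambiguity between q and -q. The function names `velocity_stats` and `quaternion_rmsd` are hypothetical.

```python
import math
from statistics import fmean, pvariance
from typing import Sequence, Tuple

Quaternion = Tuple[float, float, float, float]


def velocity_stats(positions: Sequence[float], dt: float) -> Tuple[float, float]:
    """Mean and population variance of the finite-difference velocity (m/s)."""
    velocities = [(b - a) / dt for a, b in zip(positions, positions[1:])]
    return fmean(velocities), pvariance(velocities)


def quaternion_rmsd(
    measured: Sequence[Quaternion], reference: Sequence[Quaternion]
) -> float:
    """Root mean square of the per-sample quaternion deviation (assumed definition)."""
    squared = []
    for q, r in zip(measured, reference):
        # q and -q encode the same rotation, so compare with the closer sign.
        dot = sum(a * b for a, b in zip(q, r))
        sign = 1.0 if dot >= 0.0 else -1.0
        squared.append(sum((sign * a - b) ** 2 for a, b in zip(q, r)))
    return math.sqrt(fmean(squared))


# Forward positions (m) sampled every 0.5 s from a hypothetical tracking log.
mean_v, var_v = velocity_stats([0.0, 0.1, 0.3, 0.4], dt=0.5)
```

The same `velocity_stats` call applies to the lateral axis by passing the lateral position trace instead of the forward one.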
Mixed Deep Reinforcement Learning Considering Discrete-continuous Hybrid Action Space for Smart Home Energy Management (cited by: 2)
4
Authors: Chao Huang, Hongcai Zhang, Long Wang, Xiong Luo, Yonghua Song. Journal of Modern Power Systems and Clean Energy (SCIE, EI, CSCD), 2022, No. 3, pp. 743-754 (12 pages)
This paper develops deep reinforcement learning (DRL) algorithms for optimizing the operation of a home energy system that consists of photovoltaic (PV) panels, a battery energy storage system, and household appliances. Model-free DRL algorithms can efficiently handle the difficulty of energy system modeling and the uncertainty of PV generation. However, the discrete-continuous hybrid action space of the considered home energy system challenges existing DRL algorithms, which are designed for either discrete actions or continuous actions. Thus, a mixed deep reinforcement learning (MDRL) algorithm is proposed, which integrates the deep Q-learning (DQL) algorithm and the deep deterministic policy gradient (DDPG) algorithm. The DQL algorithm deals with discrete actions, while the DDPG algorithm handles continuous actions. The MDRL algorithm learns the optimal strategy by trial-and-error interactions with the environment. However, unsafe actions, which violate system constraints, can give rise to great cost. To handle this problem, a safe-MDRL algorithm is further proposed. Simulation studies demonstrate that the proposed MDRL algorithm can efficiently handle the challenge posed by the discrete-continuous hybrid action space for home energy management. Compared with benchmark algorithms on the test dataset, the proposed MDRL algorithm reduces the operation cost while maintaining human thermal comfort. Moreover, the safe-MDRL algorithm greatly reduces the loss of thermal comfort incurred in the learning stage by the proposed MDRL algorithm.
Keywords: Demand response; deep reinforcement learning; discrete-continuous action space; home energy management; safe reinforcement learning
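The abstract notes that actions violating system constraints are costly during learning. One common way to avoid such actions is to filter the agent's continuous output before it reaches the plant. The sketch below shows this for a battery setpoint and is not the safe-MDRL mechanism of the paper, which the abstract does not detail. The function `safe_battery_power`, its parameters, and the state-of-charge limits are hypothetical, and charging losses are ignored.

```python
def safe_battery_power(
    requested_kw: float,
    soc_kwh: float,
    capacity_kwh: float,
    max_power_kw: float,
    dt_h: float,
    soc_min_frac: float = 0.1,
    soc_max_frac: float = 0.9,
) -> float:
    """Clip a charge (+) or discharge (-) request so the state of charge stays in bounds."""
    soc_low = soc_min_frac * capacity_kwh
    soc_high = soc_max_frac * capacity_kwh
    # Largest power that neither exceeds the converter rating nor the remaining headroom.
    max_charge = max(0.0, min(max_power_kw, (soc_high - soc_kwh) / dt_h))
    max_discharge = max(0.0, min(max_power_kw, (soc_kwh - soc_low) / dt_h))
    return max(-max_discharge, min(max_charge, requested_kw))


# A 5 kW charge request near the upper limit is cut to the remaining headroom.
nearly_full = safe_battery_power(5.0, soc_kwh=8.5, capacity_kwh=10.0, max_power_kw=3.0, dt_h=1.0)
# A 5 kW discharge request near the lower limit is cut the same way.
nearly_empty = safe_battery_power(-5.0, soc_kwh=2.0, capacity_kwh=10.0, max_power_kw=3.0, dt_h=1.0)
```

A filter like this keeps exploration inside the feasible set, at the price of the agent never observing the consequences of the clipped action directly.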
An enhanced eco-driving strategy based on reinforcement learning for connected electric vehicles: cooperative velocity and lane-changing control
5
Authors: Haitao Ding, Wei Li, Nan Xu, Jianwei Zhang. Journal of Intelligent and Connected Vehicles (EI), 2022, No. 3, pp. 316-332 (17 pages)
Purpose: This study aims to propose an enhanced eco-driving strategy based on reinforcement learning (RL) to alleviate the mileage anxiety of electric vehicles (EVs) in the connected environment. Design/methodology/approach: In this paper, an enhanced eco-driving control strategy based on an advanced RL algorithm in hybrid action space (EEDC-HRL) is proposed for connected EVs. The EEDC-HRL simultaneously controls longitudinal velocity and lateral lane-changing maneuvers to exploit more of the eco-driving potential. Moreover, this study redesigns an all-purpose reward function that trains efficiently, with the aim of saving energy while preserving other aspects of driving performance. Findings: To illustrate the performance of the EEDC-HRL, the controlled EV was trained and tested in various traffic flow states. The experimental results demonstrate that the proposed technique can effectively improve energy efficiency without sacrificing travel efficiency, comfort, safety and lane-changing performance in different traffic flow states. Originality/value: The contributions of this paper are two-fold. An enhanced eco-driving strategy based on an advanced RL algorithm in hybrid action space (EEDC-HRL) is proposed to jointly optimize longitudinal velocity and lateral lane-changing for connected EVs. A full-scale reward function consisting of multiple sub-rewards with a safety control constraint is redesigned to achieve eco-driving while preserving other aspects of driving performance.
Keywords: Ecological driving; Electric vehicles; Reinforcement learning in hybrid action space; Velocity and lane-changing control; Reward function
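A reward built from several sub-rewards plus a safety constraint, as this abstract describes, is often written as a weighted sum that is overridden by a penalty when the constraint is violated. The sketch below shows that general pattern only; the abstract does not give the paper's actual terms. The class `RewardWeights`, the function `eco_driving_reward`, the choice of sub-rewards, and every weight are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    """Assumed weights; real values would need tuning for a given simulator."""
    energy: float = 1.0
    travel: float = 0.5
    comfort: float = 0.2
    lane_change: float = 0.1
    unsafe_penalty: float = 100.0


def eco_driving_reward(
    energy_kwh: float,
    speed: float,
    target_speed: float,
    jerk: float,
    changed_lane: bool,
    min_gap: float,
    safe_gap: float,
    w: RewardWeights = RewardWeights(),
) -> float:
    """Weighted sum of sub-rewards, replaced by a fixed penalty when unsafe."""
    if min_gap < safe_gap:
        # Safety constraint: an unsafe gap dominates every other term.
        return -w.unsafe_penalty
    r_energy = -energy_kwh                               # less energy is better
    r_travel = -abs(speed - target_speed) / target_speed  # stay near target speed
    r_comfort = -abs(jerk)                               # smooth acceleration
    r_lane = -1.0 if changed_lane else 0.0               # discourage needless changes
    return (
        w.energy * r_energy
        + w.travel * r_travel
        + w.comfort * r_comfort
        + w.lane_change * r_lane
    )


step_reward = eco_driving_reward(
    energy_kwh=0.02, speed=18.0, target_speed=20.0, jerk=0.5,
    changed_lane=True, min_gap=30.0, safe_gap=10.0,
)
```

Keeping the weights in one immutable object makes it easy to compare reward designs without touching the reward function itself.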