Abstract: Previous work, titled “A Comparison of PPO, TD3, and SAC Reinforcement Algorithms for Quadruped Walking Gait Generation”, presented the performance of the state-of-the-art deep reinforcement learning algorithms Proximal Policy Optimization (PPO), Twin Delayed Deep Deterministic Policy Gradient (TD3), and Soft Actor-Critic (SAC) for generating a quadruped walking gait in a virtual environment. We demonstrated there that SAC performed best at generating the walking gait for certain sensor configurations in the virtual environment. In this work, we analyze the performance of the same algorithms for quadruped walking gait generation in a physical environment. Performance is determined by transfer learning augmented by real-time reinforcement learning for gait generation on a physical quadruped. The quadruped is equipped with a range of sensors: a stereo camera for position tracking, force-sensitive resistors for contact sensing on each leg, and nine inertial measurement units for proprioceptive information about the robot body and legs. The comparison uses metrics associated with the walking gait: average forward velocity (m/s), average forward velocity variance, average lateral velocity (m/s), average lateral velocity variance, and quaternion root mean square deviation. The strengths and weaknesses of each algorithm for the given task on the physical quadruped are discussed.
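The gait metrics listed in the abstract can be illustrated with a minimal sketch. The abstract does not define how the metrics are computed, so everything here is an assumption: velocities are taken as finite differences of tracked positions sampled at a fixed interval `dt`, the variance is the population variance, and the quaternion root mean square deviation is measured against a single reference orientation `ref`. The function names are hypothetical.

```python
import math
from statistics import fmean, pvariance


def velocity_stats(positions, dt):
    """Return (mean, population variance) of the finite-difference velocity.

    positions: coordinates along one axis (m), sampled every dt seconds.
    Use the forward axis for forward velocity, the lateral axis for lateral.
    """
    velocities = [(b - a) / dt for a, b in zip(positions, positions[1:])]
    return fmean(velocities), pvariance(velocities)


def quaternion_rmsd(quats, ref):
    """Root mean square deviation of unit quaternions from a reference.

    Assumed definition: Euclidean distance in quaternion components.
    q and -q encode the same rotation, so the closer sign is used.
    """
    squared = []
    for q in quats:
        d_same = sum((a - b) ** 2 for a, b in zip(q, ref))
        d_flip = sum((a + b) ** 2 for a, b in zip(q, ref))
        squared.append(min(d_same, d_flip))
    return math.sqrt(fmean(squared))
```

Calling `velocity_stats` once on the forward coordinate and once on the lateral coordinate of the stereo-camera track would give four of the five metrics; `quaternion_rmsd` on the body orientation samples would give the fifth under the assumed definition.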
Funding: The Science Foundation of Ministry of Transport of the People's Republic of China (No. 200731822301-7)
Abstract: To study the critical load position that causes cavities beneath a continuously reinforced concrete pavement (CRCP) slab under vehicle loading, the elliptical load is converted into an equivalent square load based on the equivalence principle. The CRCP slab is analyzed to determine the cavity position beneath the slab under vehicle loading, and the influence of cavity size on the slab's stress and vertical displacement is investigated. The results show that cavity formation is unavoidable under traffic loading and that the cavity is located at the edge of the longitudinal crack and at the slab corner. The cavity size has a clear influence on the largest horizontal tensile stress and vertical displacement. The slab corner is the critical load position of the CRCP slab. The results can assist the design of CRCP in avoiding cavities beneath slabs subject to vehicle loading.
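The conversion from an elliptical to a square load can be sketched as follows. The abstract does not state which quantity the equivalence principle preserves; this sketch assumes equal contact area (and therefore equal contact pressure for the same wheel load). The function names and the example dimensions are hypothetical.

```python
import math


def equivalent_square_side(semi_major, semi_minor):
    """Side length of the square whose area equals that of the ellipse.

    Assumption: area equivalence, i.e. side**2 == pi * a * b.
    """
    return math.sqrt(math.pi * semi_major * semi_minor)


def contact_pressure(load, side):
    """Uniform pressure of a load spread over a square of the given side."""
    return load / side ** 2


# Hypothetical tyre footprint with semi-axes 0.15 m and 0.10 m, 25 kN load.
side = equivalent_square_side(0.15, 0.10)
pressure = contact_pressure(25_000.0, side)
```

Under this assumption the square footprint carries the same total load at the same average pressure as the ellipse, which is what makes it convenient for meshing in a slab model.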
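Abstract: To address hover control and trajectory tracking for quadrotor UAVs, the proximal policy optimization algorithm is used to control the quadrotor. A neural network trained by reinforcement learning maps the state directly to the four rotors; the approach is a technique for controlling any linear or nonlinear system under unknown dynamic parameters and disturbances. Based on reward shaping in reinforcement learning (RL), a novel reward function is proposed that, compared with the traditional PID algorithm, lets the UAV fly faster and more smoothly. Experiments show that the quadrotor UAV can perform fixed-point hovering and trajectory tracking in three dimensions with high accuracy and stability, with accuracy reaching 97.2%; the position controller presented in the paper shows generalization and robustness.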
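Reward shaping can be illustrated with a minimal sketch. The abstract does not give the proposed reward function, so this is not the paper's reward: it shows the standard potential-based form, r' = r + γΦ(s') − Φ(s), with an assumed potential equal to the negative distance to the hover target. The names `potential` and `shaped_reward` and the default `gamma` are hypothetical.

```python
import math


def potential(position, target):
    """Assumed potential: negative Euclidean distance to the target (m)."""
    return -math.dist(position, target)


def shaped_reward(base_reward, position, next_position, target, gamma=0.99):
    """Potential-based shaping: add gamma * phi(s') - phi(s) to the reward."""
    return (base_reward
            + gamma * potential(next_position, target)
            - potential(position, target))


# Hypothetical hover target 1 m above the start position.
target = (0.0, 0.0, 1.0)
start = (0.0, 0.0, 0.0)
toward = shaped_reward(0.0, start, (0.0, 0.0, 0.5), target)   # moves closer
away = shaped_reward(0.0, start, (0.0, 0.0, -0.5), target)    # moves away
```

With this potential, a step toward the target earns a positive shaping term and a step away earns a negative one, which gives the policy a dense learning signal even when the base reward is sparse.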