Abstract
Many overhead power distribution lines are built in areas with harsh environments and complex terrain, where ice, snow, dirt, or foreign objects on the lines can easily cause faults. To improve the efficiency and safety of power maintenance personnel and to automate line cleaning, a path planning method for an automatic cleaning robot based on deep reinforcement learning is proposed. First, a cleaning path planning model is established based on deep reinforcement learning with collision constraints and target constraints, and the agent is rewarded for reaching the target position and for deep-cleaning key parts. Then, the deep reinforcement learning is optimized with an Actor-Critic scheme: the Actor takes actions according to probabilities, the resulting rewards are fed back to the Critic, and the Critic compares the actions taken with the feedback to determine subsequent actions. Finally, a long short-term memory (LSTM) network is introduced to process historical data and data of different dimensions, the A3C method is used to select and evaluate paths, and a multi-threaded reinforcement learning approach lets multiple agents complete path planning and cleaning in parallel. Simulations on the Gazebo platform show that the proposed method shortens cleaning time by at least 15% and significantly improves the operating efficiency of the cleaning robot.
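The reward design and the Actor-Critic feedback described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the reward magnitudes, the `StepOutcome` fields, and the helper names `step_reward` and `n_step_advantages` are all hypothetical, chosen only to show how a collision constraint, a goal reward, and a key-part cleaning reward might be combined, and how an A3C-style worker could turn rewards and Critic value estimates into advantages.

```python
from dataclasses import dataclass

# Illustrative constants only; the paper's abstract does not give reward values.
GOAL_REWARD = 10.0        # assumed bonus for reaching the target position
KEY_PART_REWARD = 5.0     # assumed bonus for deep-cleaning a key part
COLLISION_PENALTY = -10.0 # assumed penalty for violating the collision constraint
STEP_PENALTY = -0.1       # assumed small cost per step to favor shorter paths


@dataclass
class StepOutcome:
    """Hypothetical summary of what happened in one environment step."""
    collided: bool
    reached_goal: bool
    cleaned_key_part: bool


def step_reward(outcome: StepOutcome) -> float:
    """Combine the collision constraint with goal and key-part rewards."""
    if outcome.collided:
        return COLLISION_PENALTY
    reward = STEP_PENALTY
    if outcome.reached_goal:
        reward += GOAL_REWARD
    if outcome.cleaned_key_part:
        reward += KEY_PART_REWARD
    return reward


def n_step_advantages(rewards, values, bootstrap_value, gamma=0.99):
    """Compute A_t = R_t - V(s_t) with R_t = r_t + gamma * R_{t+1}.

    `values` are the Critic's estimates V(s_t) for the visited states and
    `bootstrap_value` is its estimate for the state after the last step
    (0.0 if the episode ended). Each parallel worker would compute these
    on its own rollout before sending gradients to the shared model.
    """
    advantages = []
    ret = bootstrap_value
    for r, v in zip(reversed(rewards), reversed(values)):
        ret = r + gamma * ret
        advantages.append(ret - v)
    advantages.reverse()
    return advantages
```

In this sketch a positive advantage means the sampled action did better than the Critic expected, so the Actor would raise its probability; a negative one lowers it. The LSTM state encoder and the Gazebo environment are deliberately left out.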
Authors
WANG Yu (王榆)
CHEN Kai (陈凯)
ZHOU Yun-ting (周云婷)
State Grid Fuzhou Power Supply Company, Fuzhou, Fujian 350001, China; Fuzhou University, Fuzhou, Fujian 350108, China
Source
Computer Simulation (《计算机仿真》)
Peking University Core Journal (北大核心)
2023, Issue 12, pp. 128-132, 225 (6 pages in total)
Funding
Science and Technology Project of State Grid Fujian Electric Power Co., Ltd. (5213102000FB).
Keywords
Deep reinforcement learning
Path planning
Overhead lines
Multi-agent