
Multi-action parallel asynchronous deep deterministic policy gradient based decision-making approach of operational indices for mineral processing
Cited by: 1
Abstract: To address the insufficient exploration ability of the deep deterministic policy gradient (DDPG) algorithm, a multi-action parallel asynchronous deep deterministic policy gradient (MPADDPG) algorithm is proposed and applied to reinforcement-learning-based decision-making of operational indices in mineral processing. The algorithm uses multiple actor networks with different initializations and training, which improves exploration to varying degrees, and it reveals the relationship between exploration and exploitation by extending the critic architecture of the deterministic policy gradient. Using multiple DDPGs instead of a single DDPG mitigates the impact of one poorly performing DDPG and improves learning stability. The parallel asynchronous structure improves data utilization efficiency and speeds up network convergence. Finally, the actors obtain better policy gradients by influencing the critic's update. Experimental results on decision-making of operational indices in a mineral processing process verify the effectiveness of the proposed algorithm.
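The abstract does not say how the proposals of the multiple actor networks are combined. As a rough illustration only, the sketch below assumes that each actor proposes an action and that the action scored highest by a critic is kept. The names `make_actor`, `toy_critic`, and `select_action` are hypothetical, and the linear actors and hand-written critic are stand-ins for trained networks, not the authors' MPADDPG implementation.

```python
import math
import random


def make_actor(state_dim, seed):
    """Hypothetical linear actor with its own random initialization."""
    rng = random.Random(seed)
    weights = [rng.uniform(-1.0, 1.0) for _ in range(state_dim)]

    def actor(state):
        # Deterministic policy: a = tanh(w . s), bounded in (-1, 1).
        return math.tanh(sum(w * s for w, s in zip(weights, state)))

    return actor


def toy_critic(state, action):
    """Stand-in for a learned Q(s, a); it peaks when action equals mean(state)."""
    target = sum(state) / len(state)
    return -(action - target) ** 2


def select_action(actors, critic, state):
    """Assumed rule: every actor proposes an action, the critic picks the best."""
    proposals = [actor(state) for actor in actors]
    scores = [critic(state, a) for a in proposals]
    best = max(range(len(actors)), key=lambda i: scores[i])
    return best, proposals[best], scores[best]


# Four actors that differ only in their initialization seed.
actors = [make_actor(state_dim=3, seed=k) for k in range(4)]
state = [0.2, -0.1, 0.5]
best_index, action, q_value = select_action(actors, toy_critic, state)
print(best_index, round(action, 3), round(q_value, 3))
```

Because the actors start from different weights, they propose different actions for the same state, which is one simple way that several actors can widen exploration compared with a single deterministic policy.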
Authors: LI Qiao-ran; DING Jin-liang (State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University, Shenyang 110004, China)
Source: Control and Decision (EI, CSCD, Peking University Core Journal), 2022, No. 8, pp. 1989-1996 (8 pages)
Funding: National Key R&D Program of China (2018YFB1701104); Liaoning Province Science and Technology Project (2020JH1/10100008).
Keywords: mineral processing; operational indices; decision-making; multi-action; parallel asynchronous; deep deterministic policy gradient