Training Framework of Distributed Robot Reinforcement Learning Based on Spark (cited by: 1)
Abstract: Reinforcement learning lets a robot learn autonomously to complete tasks that are difficult to achieve with conventional control methods, sparing system designers from modeling the system or hand-crafting rules. However, training is expensive in robot development and application: it demands substantial time and hardware. Simulation reduces the hardware cost to some extent, but on complex robot training platforms such as Gazebo the simulation runs inefficiently and data sampling takes a long time. To address these problems, this paper proposes a Spark-based distributed reinforcement learning framework that improves the usability and compatibility of the robot simulation platform, provides distributed support for both reinforcement learning training and robot simulation sampling, and offers high compatibility and robustness. Comparative analysis of the experimental data shows that the framework speeds up the training of robot reinforcement learning models, shortens training time, and helps reduce hardware cost.
Authors: FANG Wei (方伟); HUANG Zeng-qiang (黄增强); XU Jian-bin (徐建斌); HUANG Yi (黄羿); MA Xin-qiang (马新强). Affiliations: Institute of Cyber Systems and Control, Zhejiang University, Hangzhou, Zhejiang 310027, China; Department of Computer Science and Technology, Huaibei Vocational and Technical College, Huaibei, Anhui 235000, China; School of Computer Science, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, China; Materials Branch, State Grid Zhejiang Electric Power Company, Ltd., Hangzhou, Zhejiang 310000, China; Institute of Intelligent Computing and Visualization Based on Big Data, Chongqing University of Arts and Sciences, Chongqing 402160, China.
Source: Journal of Graphics (《图学学报》), indexed in CSCD and the Peking University Core Journals list, 2019, No. 5, pp. 852-857 (6 pages).
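Funding: Open Research Project of the State Key Laboratory of Industrial Control Technology, Zhejiang University (ICT1800413); Major Industrial Technology R&D Project of the Chongqing Development and Reform Commission (2018148208); Science and Technology Project of the Chongqing Municipal Education Commission (KJ1601129); Key Natural Science Research Project of Anhui Higher Education Institutions (KJ2018A0713); Domestic Visiting and Training Program for Outstanding Young Backbone Talents of Anhui Higher Education Institutions (gxgnfx2018108); Key-Area R&D Program of Guangdong Province (2019B010120001).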
Keywords: robot; reinforcement learning; Spark; distributed; data pipeline