
Application research of multi-agent layered reinforcement learning in data location (cited by: 1)
Abstract: To locate data within domain texts, this paper treats the text as an environment. To address the dynamics and uncertainty present in such text environments, it proposes a data location method based on multi-agent hierarchical reinforcement learning. Exploiting the hierarchical structure, the method decomposes the overall task into multiple subtasks, each learned by an individual agent, thereby confining policy updates to a small local space. At the same time, drawing on the alignment between each agent's objective and the system's long-term goal, it introduces a policy coordination mechanism in which agents exchange information to discover trend information, and applies a shaping technique that uses dynamically acquired online knowledge to heuristically guide each agent along these trends, accelerating convergence. The method was applied to judgment documents in the judicial field; experimental results show that it can locate target data efficiently and accurately in large-scale, complex, and unknown text environments, reaching an average accuracy of 96.6% and an F-value of 98.2% with good convergence speed. The method therefore achieves effective data location in domain texts and has considerable theoretical and practical significance.
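The shaping technique the abstract relies on is potential-based reward shaping (Ng et al., reference 8 below). A minimal sketch of the idea follows; it is purely illustrative, not the paper's system: the chain environment, the potential function, and all hyperparameters are assumptions, and the multi-agent hierarchy is omitted.

```python
import random

# Minimal sketch: tabular Q-learning with potential-based reward shaping.
# The environment is a hypothetical chain of N states; the goal is the
# rightmost state, and reward is sparse (only at the goal).

N = 10            # states 0 .. N-1
GAMMA = 0.9       # discount factor
ALPHA = 0.5       # learning rate
ACTIONS = (-1, 1) # move left or right along the chain

def step(s, a):
    """Deterministic chain dynamics; reward 1 only on reaching the goal."""
    s2 = max(0, min(N - 1, s + a))
    reward = 1.0 if s2 == N - 1 else 0.0
    return s2, reward, s2 == N - 1

def potential(s):
    """Heuristic potential: states nearer the goal get higher potential.
    This plays the role of the 'trend information' used for shaping."""
    return s / (N - 1)

def train(episodes=300, shaped=True, seed=0):
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in range(N) for a in ACTIONS}
    for _ in range(episodes):
        s = 0
        for _ in range(4 * N):  # step cap per episode
            # epsilon-greedy action selection
            if rng.random() < 0.2:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda x: Q[(s, x)])
            s2, r, done = step(s, a)
            if shaped:
                # Shaping term F(s, s') = gamma * phi(s') - phi(s):
                # densifies the reward without changing the optimal policy.
                r += GAMMA * potential(s2) - potential(s)
            best_next = 0.0 if done else max(Q[(s2, x)] for x in ACTIONS)
            Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
            s = s2
            if done:
                break
    # Did the greedy policy learn to move toward the goal in every state?
    return all(Q[(s, 1)] > Q[(s, -1)] for s in range(N - 1))

print(train(shaped=True))
```

Because the shaping term is a potential difference, it telescopes along any trajectory and so preserves the optimal policy while densifying the sparse goal reward; this is what allows online trend knowledge to speed up convergence safely.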
Authors: Hong Zhuangzhuang; Wan Zhongbao; Zhang Wei; Huang Zhaohua (Dept. of Software Engineering, East China Jiaotong University, Nanchang 330013, China)
Source: Application Research of Computers (《计算机应用研究》), CSCD, PKU Core Journal, 2020, No. 12, pp. 3635-3639 (5 pages)
Funding: National Key R&D Program of China (2018YFC0831106); Natural Science Foundation of Jiangxi Province (20122BAB201040)
Keywords: data location; text environment; hierarchical reinforcement learning; multi-agent system; policy coordination; shaping technique

References: 3

Secondary references: 28

  • 1 Sutton R S, Barto A G. Introduction to Reinforcement Learning [M]. Cambridge: MIT Press, 1998.
  • 2 Liu C, Xu X, Hu D. Multiobjective reinforcement learning: a comprehensive overview [J]. IEEE Trans on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 2013, 99(4): 1-13.
  • 3 Sutton R S, Precup D, Singh S P. Between MDPs and semi-MDPs: a framework for temporal abstraction in reinforcement learning [J]. Artificial Intelligence, 1999, 112(1): 181-211.
  • 4 Parr R. Hierarchical control and learning for Markov decision processes [D]. Berkeley: University of California at Berkeley, 1998.
  • 5 Hengst B. Discovering hierarchical reinforcement learning [D]. Sydney: University of New South Wales, 2003.
  • 6 Dietterich T G. Hierarchical reinforcement learning with the MAXQ value function decomposition [J]. Journal of Artificial Intelligence Research, 2000, 13(1): 227-303.
  • 7 Hwang K S, Lin H Y, Hsu Y P, et al. Self-organizing state aggregation for architecture design of Q-learning [J]. Information Sciences, 2011, 181(13): 2813-2822.
  • 8 Ng A Y, Harada D, Russell S. Policy invariance under reward transformations: theory and application to reward shaping [C] //Proc of the 16th Int Conf on Machine Learning. San Francisco: Morgan Kaufmann, 1999: 278-287.
  • 9 Bianchi R A C, Ribeiro C H C, Costa A H R. Accelerating autonomous learning by using heuristic selection of actions [J]. Journal of Heuristics, 2008, 14(2): 135-168.
  • 10 Busoniu L, Babuska R, Schutter B D, et al. Reinforcement Learning and Dynamic Programming Using Function Approximators [M]. New York: CRC Press, 2010.

Co-cited literature: 318

Jointly cited literature: 15

Citing literature: 1

Secondary citing literature: 1

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部