Abstract
To locate data in domain-specific text, this paper treats the text as an environment and, to address the dynamics and uncertainty of that environment, proposes a data location method based on multi-agent hierarchical reinforcement learning. The method exploits the hierarchical structure to decompose the system task into multiple subtasks, with each agent learning its own subtask, so that policy updates are confined to a small local space. It also exploits the fact that each individual agent shares the system's long-term goal: a policy coordination mechanism lets agents exchange information to discover trend information, and a shaping technique uses this dynamic knowledge, acquired online, to give each agent trend-based heuristic guidance and speed up its convergence. The method was applied to judgment documents in the judicial domain. The experimental results show that it can locate target data efficiently and accurately in a large-scale, complex, and unknown text environment, with an average accuracy of 96.6% and an F-value of 98.2%, and that it converges quickly. The method is therefore well suited to data location in domain text and has considerable theoretical and practical significance.
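The abstract names a shaping technique but does not specify it. As a rough illustration only, the sketch below shows potential-based reward shaping, one common form of shaping, inside a single tabular Q-learning update. The state encoding (a token position), the `potential` function, and the names `shaped_reward`, `q_update`, and `discounted_return` are all assumptions made for this example and are not taken from the paper.

```python
from typing import Callable, Dict, Sequence, Tuple

State = int   # hypothetical: index of a token position in the text
Action = int  # hypothetical: -1 = move left, +1 = move right


def shaped_reward(r: float, s: State, s_next: State,
                  potential: Callable[[State], float], gamma: float) -> float:
    """Environment reward plus the shaping term gamma * phi(s') - phi(s)."""
    return r + gamma * potential(s_next) - potential(s)


def q_update(q: Dict[Tuple[State, Action], float], s: State, a: Action,
             r: float, s_next: State, actions: Sequence[Action],
             potential: Callable[[State], float],
             alpha: float = 0.5, gamma: float = 0.9) -> float:
    """One tabular Q-learning step that learns from the shaped reward."""
    best_next = max(q.get((s_next, b), 0.0) for b in actions)
    target = shaped_reward(r, s, s_next, potential, gamma) + gamma * best_next
    old = q.get((s, a), 0.0)
    q[(s, a)] = old + alpha * (target - old)
    return q[(s, a)]


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Sum of gamma**t * r_t over a trajectory."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


# Assumed heuristic: positions closer to a target at index 7 get a higher potential.
def potential(s: State) -> float:
    return -abs(7 - s)


q_table: Dict[Tuple[State, Action], float] = {}
new_value = q_update(q_table, s=5, a=+1, r=0.0, s_next=6,
                     actions=(-1, +1), potential=potential)
```

With a potential of this form, the shaping terms telescope along any trajectory, so the shaped return differs from the original return only by a term that depends on the start and end states. This is why this kind of shaping can guide exploration without changing which policy is optimal.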
Authors
洪壮壮
万仲保
张薇
黄兆华
Hong Zhuangzhuang; Wan Zhongbao; Zhang Wei; Huang Zhaohua (Dept. of Software Engineering, East China Jiaotong University, Nanchang 330013, China)
Source
《计算机应用研究》
CSCD
Peking University Core Journal
2020, No. 12, pp. 3635-3639 (5 pages)
Application Research of Computers
Funding
National Key Research and Development Program of China (2018YFC0831106)
Natural Science Foundation of Jiangxi Province (20122BAB201040).