Abstract
Compared with conventional single-hop Machine Reading Comprehension (MRC), Multi-Hop MRC (MHMRC) requires multi-hop reasoning over multiple related documents or paragraphs to understand and answer complex questions. It is closer to human language reasoning ability and has broad application prospects, but it is also highly challenging. This paper introduces the research background of MHMRC and divides existing methods, according to their applicable scenarios, into closed-set Question Answering (QA) and Open-domain Question Answering (OpenQA). The methods covered mainly include those based on question decomposition, those based on Graph Neural Networks (GNN), improved-retrieval methods, and those based on reasoning paths; each is analyzed in terms of model architecture, characteristics, advantages, and disadvantages. The paper then describes the unstructured text datasets and evaluation metrics used for multi-hop reasoning and compares the performance of the models. On this basis, it analyzes the current hotspots and difficulties of MHMRC research and points out directions for future development.
Authors
苏珂
黄瑞阳
张建朋
余诗媛
胡楠
SU Ke; HUANG Ruiyang; ZHANG Jianpeng; YU Shiyuan; HU Nan (Software College, Zhengzhou University, Zhengzhou 450001, China; National Digital Switching System Engineering & Technological R&D Center, Zhengzhou 450002, China)
Source
《计算机工程》
CAS
CSCD
Peking University Core Journals
2021, No. 9, pp. 1-17 (17 pages)
Computer Engineering
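Funding
National Natural Science Foundation of China, Young Scientists Fund (62002384)
China Postdoctoral Science Foundation, General Program (47698)
Zhengzhou Major Collaborative Innovation Project (162/32410218).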
Keywords
Machine Reading Comprehension (MRC)
Multi-Hop MRC (MHMRC)
question decomposition
Graph Neural Network (GNN)
Open-domain Question Answering (OpenQA)