Question-answer Pair Generation Based on Key Phrase Extraction and Answer Filtering
Abstract High-quality question-answer pairs help readers obtain knowledge from articles, improve the performance of question-answering systems, and promote machine reading comprehension, so they play an important role in both human activities and artificial intelligence. Current mainstream methods for generating question-answer pairs rely on candidate answers supplied from the article and generate a specific question for each answer. However, some candidate answers lead to questions that cannot be answered from the article, or to questions whose answer is no longer the candidate answer; the resulting pairs are poorly correlated, which lowers their quality. To address this problem, this study proposes a method that generates question-answer pairs based on key phrase extraction and filtering. The method automatically extracts key phrases suitable for question generation from the input text as candidate answers, then produces question-answer pairs from these candidates with a question generator and an answer generator. Finally, it compares the similarity between each candidate answer and the corresponding generated answer and filters out pairs with low correlation, so that only quality-assured pairs are output. The method is evaluated experimentally on the SQuAD 1.1 and NewsQA datasets, and the quality of the generated pairs is also checked manually. The results show that the method effectively improves the quality of the generated question-answer pairs.
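The filtering step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which similarity measure or threshold the authors use, so the token-overlap F1 score, the `0.5` threshold, and the names `QAPair`, `token_f1`, and `filter_qa_pairs` below are all assumptions made for illustration. The example pairs are hand-written stand-ins for the output of a question generator and an answer generator, not results from the paper.

```python
from collections import Counter
from typing import List, NamedTuple


class QAPair(NamedTuple):
    question: str
    candidate_answer: str  # key phrase extracted from the passage
    generated_answer: str  # answer produced by the answer generator


def token_f1(a: str, b: str) -> float:
    """Token-overlap F1 between two strings (an assumed similarity measure)."""
    a_tokens, b_tokens = a.lower().split(), b.lower().split()
    if not a_tokens or not b_tokens:
        return 0.0
    # Multiset intersection counts each shared token at most as often as it
    # appears in both strings.
    overlap = sum((Counter(a_tokens) & Counter(b_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(b_tokens)
    recall = overlap / len(a_tokens)
    return 2 * precision * recall / (precision + recall)


def filter_qa_pairs(pairs: List[QAPair], threshold: float = 0.5) -> List[QAPair]:
    """Keep pairs whose generated answer is close enough to the candidate answer."""
    return [
        p for p in pairs
        if token_f1(p.candidate_answer, p.generated_answer) >= threshold
    ]


# Hypothetical generator outputs, written by hand for this illustration.
pairs = [
    QAPair("Where is the Eiffel Tower?", "Paris", "Paris"),
    QAPair("Who designed the tower?", "Gustave Eiffel", "the engineer Gustave Eiffel"),
    QAPair("When was it built?", "1889", "a wrought-iron lattice tower"),
]

kept = filter_qa_pairs(pairs)
for pair in kept:
    print(pair.question, "->", pair.candidate_answer)
```

In this sketch the third pair is dropped because its generated answer shares no tokens with the candidate answer, which mirrors the case the abstract describes where the answer to a generated question is no longer the candidate answer.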
Authors GUO Zheng-Rong (郭峥嵘); GUO Gong-De (郭躬德); WANG Hui (王晖) (College of Computer and Cyber Security, Fujian Normal University, Fuzhou 350117, China; School of Electronics, Electrical Engineering and Computer Science, Queen's University Belfast, Belfast BT9 5BN, United Kingdom)
Source Computer Systems & Applications (《计算机系统应用》), 2023, No. 6, pp. 293-300 (8 pages)
Funding National Natural Science Foundation of China (61976053, 62171131); Natural Science Foundation of Fujian Province (2022J01398).
Keywords question-answer pair; candidate answer; key phrase extraction; T5 model; similarity filtering