Abstract
Machine reading comprehension has been one of the most popular and cutting-edge research tasks in natural language processing in recent years. It addresses the "last mile" problem of traditional retrieval-based question answering, namely locating the answer precisely. In this paper, pre-trained word vectors, supplemented by fine-tuned character vectors, form a hybrid character-word Embedding that serves as the input encoding of the model. An LSTM-based model extracts the text features. The start and end positions of the answer are tagged, the answer span is determined by moving a pointer, and all candidate answers are then put to a vote to select the best one. This way of decoding the answer is a half-pointer, half-tagging scheme. Question-answering reading comprehension experiments on the WebQA dataset show that recall reaches 89.70%, an improvement of 2.08% over the baseline model, and that the F1 score reaches 75.11%, an improvement of 0.83% over the baseline model.
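The decoding step described in the abstract (tag start and end positions, move a pointer to close each span, then vote over all candidates) can be illustrated with a minimal sketch. This is not the paper's implementation. The function names `decode_spans` and `vote`, the 0.5 threshold, the maximum span length, and the use of the product of start and end probabilities as the vote weight are all assumptions made for illustration. The per-token probabilities are assumed to come from some upstream model such as the LSTM encoder.

```python
from collections import defaultdict

def decode_spans(start_probs, end_probs, threshold=0.5, max_len=10):
    """Pair each tagged start with the nearest tagged end at or after it.

    threshold and max_len are illustrative assumptions, not values from the paper.
    """
    spans = []
    for i, p_start in enumerate(start_probs):
        if p_start < threshold:
            continue  # position i is not tagged as an answer start
        # Move the pointer rightwards from the start until an end tag is found.
        for j in range(i, min(i + max_len, len(end_probs))):
            if end_probs[j] >= threshold:
                spans.append((i, j, p_start * end_probs[j]))
                break
    return spans

def vote(passages):
    """Sum span scores per answer string across passages and return the winner.

    passages: list of (tokens, start_probs, end_probs) triples.
    The score-weighted vote is an assumed scheme.
    """
    scores = defaultdict(float)
    for tokens, start_probs, end_probs in passages:
        for i, j, score in decode_spans(start_probs, end_probs):
            scores["".join(tokens[i:j + 1])] += score
    if not scores:
        return None  # no span passed the threshold in any passage
    return max(scores, key=scores.get)

# Toy example with made-up probabilities for two retrieved passages.
passages = [
    (["李", "白", "是", "诗", "人"], [0.9, 0.1, 0.0, 0.2, 0.0], [0.1, 0.8, 0.0, 0.1, 0.3]),
    (["杜", "甫", "和", "李", "白"], [0.6, 0.0, 0.0, 0.7, 0.0], [0.0, 0.55, 0.0, 0.0, 0.9]),
]
print(vote(passages))  # 李白
```

In this toy input the candidate "李白" is found in both passages and so accumulates a higher total score than "杜甫", which appears only once. Voting in this way rewards answers that recur across passages.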
Author
LIU Xin (刘鑫), Wuhan Research Institute of Posts and Telecommunications, Wuhan 430074, China
Source
Electronic Design Engineering (《电子设计工程》)
2021, Issue 11, pp. 166-170 (5 pages)