Abstract
Information retrieval (IR) has long been an active research topic in natural language processing (NLP). With the steady development of deep learning for NLP tasks, researchers have used neural information retrieval models to capture relevance matching information between queries and the documents to be retrieved. However, existing work usually performs relevance matching at the word level, without fully considering word order or the context of each word, and therefore cannot resolve the polysemy that may occur in a sentence. To capture deep interaction information between a query and the documents to be retrieved, a sentence-level deep relevance matching model is studied. The query and the documents are segmented into sentences, which are semantically more complete than words. For each query sentence, a similarity score is computed against every sentence in the document, and the scores are mapped by similarity level into a fixed-length local relevance matching histogram. A feed-forward matching network then learns hierarchical matching information and computes a matching score for each query sentence, and a gating network aggregates the matching scores of all query sentences into the final similarity score for the query-document pair. Experimental results on the MED dataset show that the sentence-level deep relevance matching model improves retrieval performance over traditional retrieval models and some unsupervised sentence-level retrieval models.
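The histogram and aggregation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the sentence vectors, the cosine similarity measure, the equal-width bins over [-1, 1], and the function names `cosine`, `matching_histogram`, and `aggregate` are all assumptions made for illustration. The per-sentence scores would come from a learned feed-forward network, which is not shown here.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def matching_histogram(query_vec, doc_vecs, n_bins=4):
    """Map a variable number of similarities to a fixed-length count histogram.

    Assumption: equal-width bins over [-1, 1]; a similarity of exactly 1
    falls into the last bin.
    """
    hist = [0] * n_bins
    for d in doc_vecs:
        sim = max(-1.0, min(1.0, cosine(query_vec, d)))
        idx = min(int((sim + 1.0) / 2.0 * n_bins), n_bins - 1)
        hist[idx] += 1
    return hist

def aggregate(scores, gate_logits):
    """Softmax-gated weighted sum of per-query-sentence matching scores."""
    m = max(gate_logits)  # subtract the max for numerical stability
    exps = [math.exp(g - m) for g in gate_logits]
    total = sum(exps)
    return sum(s * e / total for s, e in zip(scores, exps))
```

Whatever the number of sentences in a document, each query sentence yields a histogram of length `n_bins`, which is what allows a fixed-size network to consume it.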
Authors
田媛
郝文宁
陈刚
靳大尉
邹傲
TIAN Yuan; HAO Wen-ning; CHEN Gang; JIN Da-wei; ZOU Ao (School of Command & Control Engineering, Army Engineering University of PLA, Nanjing 210001, China)
Source
《计算机技术与发展》
2022, No. 6, pp. 9-14, 20 (7 pages in total)
Computer Technology and Development
Funding
National Natural Science Foundation of China (61806221).
Keywords
information retrieval
sentence level
deep relevance matching
feed-forward neural network
gating network