Passage re-ranking model with autoencoder pre-training and multi-representation interaction
Abstract In passage re-ranking, recent studies have proposed late-interaction architectures based on bi-encoders to speed up computation. Because these models use a pre-trained model to encode queries and passages independently during both training and inference, their ranking performance depends heavily on the encoding quality of that pre-trained model. Moreover, some multi-vector late-interaction approaches compute text similarity as the sum of the maximum similarities between token vectors, which is prone to partial matching. To address these limitations, this paper proposes a pre-training method called replacement paragraph prediction (RPP). RPP adopts a partially connected autoencoder architecture and uses a replaced-token prediction task similar to ELECTRA's to make the pre-trained model learn the semantic relationship between a given query and passage, thereby strengthening its representational capacity. On the interaction side, the paper designs a new late-interaction paradigm: different attention mechanisms guide different text representations of the passage to be ranked, these representations are fused dynamically, and the fused representation is compared with the query vector by dot product, which gives lower complexity and finer-grained interaction. Re-ranking experiments on the MS MARCO passage retrieval dataset show that the proposed model outperforms ColBERT and PreTTR on MRR@10 under different training conditions. With knowledge distillation, its performance approaches that of the teacher model, and ranking time is substantially reduced on both GPU and CPU.
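For readers unfamiliar with the two technical notions the abstract relies on, the following is a minimal sketch, not the authors' implementation. It shows the sum-of-maximum-similarities scoring that the abstract attributes to some multi-vector late-interaction models (ColBERT-style), and the MRR@10 metric used in the evaluation. The function names `maxsim_score` and `mrr_at_k` and the toy vectors are assumptions made for illustration only.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def maxsim_score(query_vecs: List[Vector], passage_vecs: List[Vector]) -> float:
    """Sum, over query token vectors, of the best-matching passage token similarity."""
    return sum(max(dot(q, p) for p in passage_vecs) for q in query_vecs)


def mrr_at_k(ranked_relevance: List[List[bool]], k: int = 10) -> float:
    """Mean reciprocal rank of the first relevant passage within the top k.

    ranked_relevance[i][j] is True if the passage ranked at position j
    (0-based) for query i is relevant. Queries with no relevant passage
    in the top k contribute 0.
    """
    total = 0.0
    for flags in ranked_relevance:
        for rank, is_relevant in enumerate(flags[:k], start=1):
            if is_relevant:
                total += 1.0 / rank
                break
    return total / len(ranked_relevance)


# Toy example with 2-dimensional "token" vectors (hypothetical values).
query = [[1.0, 0.0], [0.0, 1.0]]
passage = [[1.0, 0.0], [0.5, 0.5]]
score = maxsim_score(query, passage)

# Three queries: first relevant hit at rank 2, at rank 1, and none in the top 10.
runs = [[False, True], [True], [False] * 10]
metric = mrr_at_k(runs, k=10)
```

Because each query token independently picks its best passage token, a passage can score well by matching only some query tokens strongly, which is the partial-matching issue the abstract refers to.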
Authors Zhang Kang (张康); Chen Ming (陈明); Gu Fan (顾凡) (School of Information, Shanghai Ocean University, Shanghai 201306, China)
Source Application Research of Computers (《计算机应用研究》), CSCD, Peking University Core Journal, 2023, No. 12, pp. 3643-3650 (8 pages)
Funding Shanghai Science and Technology Innovation Plan Project (20dz1203800).
Keywords autoencoder; pre-training; re-ranking; late interaction