Abstract
This paper studies the zero-shot entity linking task. Current two-stage methods have two main problems: (1) in the candidate generation stage, the excessive pursuit of efficiency means that the interaction between the mention's context and the entity description is not fully considered, resulting in low recall; (2) in the candidate ranking stage, each candidate entity's relationship to the mention is considered in isolation, which limits overall accuracy to some extent. To address these problems, this paper proposes a zero-shot entity linking method based on ColBert-EL and MRC models. For candidate generation, it proposes ColBert-EL, a variant of ColBert that lets the mention's context and the entity description interact fully while still supporting fast retrieval. For candidate ranking, it models the task as a multiple-choice problem and proposes a model based on machine reading comprehension (MRC) to rank all candidates jointly. Experimental results verify the effectiveness of the proposed method.
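The abstract does not spell out how ColBert-EL scores a mention against an entity description. As a rough illustration only, the sketch below shows the late-interaction (MaxSim) scoring that ColBert-style retrievers generally use: each mention-context token is matched to its most similar entity-description token, and the maxima are summed. The function names, the toy 2-d embeddings, and the entity names are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def maxsim_score(mention_tokens: List[Vector], entity_tokens: List[Vector]) -> float:
    """Late interaction: for each mention-context token, take the similarity of
    its best-matching entity-description token, then sum those maxima."""
    return sum(max(dot(q, d) for d in entity_tokens) for q in mention_tokens)


def top_k_candidates(
    mention_tokens: List[Vector],
    entity_index: Dict[str, List[Vector]],
    k: int = 2,
) -> List[str]:
    """Rank all entities by MaxSim score and return the names of the k best."""
    ranked = sorted(
        entity_index.items(),
        key=lambda item: maxsim_score(mention_tokens, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


# Toy token embeddings, made up for illustration (assumption, not paper data).
mention = [[1.0, 0.0], [0.0, 1.0]]
entities = {
    "entity_a": [[1.0, 0.0], [0.0, 1.0]],
    "entity_b": [[0.5, 0.5]],
    "entity_c": [[-1.0, 0.0]],
}
candidates = top_k_candidates(mention, entities, k=2)
print(candidates)
```

Because entity token embeddings can be computed and stored ahead of time, only the mention side has to be encoded at query time, which is how this style of scoring can combine token-level interaction with fast retrieval.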
Authors
王雪莹
程路易
徐波
WANG Xueying; CHENG Luyi; XU Bo (College of Computer Science and Technology, Donghua University, Shanghai 201620, China)
Source
《智能计算机与应用》
2022, Issue 6, pp. 78-83 (6 pages in total)
Intelligent Computer and Applications
Keywords
zero-shot
entity linking
candidate generation
candidate ranking
reading comprehension model