Abstract
Entity linking for knowledge base question answering currently faces three main difficulties: entity names in questions are often spelled irregularly; the knowledge base contains many similar entities that lack distinguishing features; and questions lack context. To address these problems, a multi-dimensional matching entity linking method is proposed, which matches the entity candidate set step by step along the text, statistical, and entity attribute dimensions. The text dimension sets multi-level thresholds to filter a reasonably sized entity candidate set from the knowledge base. The statistical dimension re-ranks the candidate set according to entity popularity scores. The entity attribute dimension uses a Transformer-based relation prediction model to predict the entity relation in the question, and selects the final linked entity according to that relation. The method reaches an accuracy of 85.46% on the SimpleQuestions dataset, a 3.8% improvement over the existing SOTA model.
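The three matching stages described in the abstract can be sketched roughly as follows. This is only an illustrative outline under stated assumptions, not the authors' implementation: the `Entity` class, the function names, the threshold values, and the use of `difflib.SequenceMatcher` for string similarity are all hypothetical choices, and the Transformer-based relation prediction model is replaced by a relation string passed in by the caller.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Entity:
    """Hypothetical knowledge base entry used only for this sketch."""
    name: str
    popularity: float      # assumed entity popularity ("hot") score
    relations: frozenset   # relations attached to the entity in the KB


def similarity(a: str, b: str) -> float:
    # Assumption: plain string similarity stands in for the paper's text matching.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def text_candidates(mention, kb, thresholds=(0.9, 0.7, 0.5), max_size=50):
    """Text dimension: try thresholds from strict to loose, stop at the first non-empty level."""
    scored = [(similarity(mention, e.name), e) for e in kb]
    for t in thresholds:
        hits = [(s, e) for s, e in scored if s >= t]
        if hits:
            hits.sort(key=lambda pair: pair[0], reverse=True)
            return [e for _, e in hits[:max_size]]
    return []


def rerank_by_popularity(candidates):
    """Statistical dimension: most popular entities first."""
    return sorted(candidates, key=lambda e: e.popularity, reverse=True)


def link(mention, predicted_relation, kb):
    """Attribute dimension: pick the first re-ranked candidate that has the predicted relation.

    `predicted_relation` would come from a relation prediction model; here it is
    supplied directly. Falls back to the most popular candidate, or None.
    """
    candidates = rerank_by_popularity(text_candidates(mention, kb))
    for entity in candidates:
        if predicted_relation in entity.relations:
            return entity
    return candidates[0] if candidates else None
```

For example, with a toy knowledge base containing two entities named "Harry Potter" (one with a `book.author` relation, one with `film.director`), calling `link("harry poter", "book.author", kb)` would return the book entity even though the film entity has the higher popularity score, because the relation match is applied last.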
Authors
张森
张攀
ZHANG Sen; ZHANG Pan (College of Computer Science, Sichuan University, Chengdu 610065; National Key Laboratory of Fundamental Science on Synthetic Vision, Sichuan University, Chengdu 610065)
Source
Modern Computer (《现代计算机》)
2021, No. 12, pp. 8-13, 18 (7 pages in total)
Funding
Sichuan Science and Technology Program (Key R&D Project) (No. 2020YFG0327, 2020YFG0306).