Abstract
Entity linking is the process of mapping entity mentions in text to the corresponding entities in a knowledge base (KB), and it plays a key role in knowledge base expansion. Traditional entity linking methods rely mainly on surface features such as context similarity and ignore the semantic relatedness between co-occurring entities. To address this, a collective entity linking method that fuses multiple features is proposed. First, a synonym table and a namesake table are combined to generate the candidate entity set. Next, semantic features are extracted from multiple perspectives and integrated into a constructed entity relatedness graph (referent graph). Finally, the candidate entities are ranked and the top-1 entity is selected as the linking target. Experiments on the NLP&CC2013 Chinese microblog entity linking evaluation data set yield an average accuracy of 90.97%, which compares favorably with the best system in that evaluation.
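The three steps named in the abstract (candidate generation, graph-based feature fusion, top-1 ranking) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the `alpha` mixing weight, the scoring formula, and the toy tables are all assumptions made for illustration, and the paper's actual features and graph algorithm are not given in this record.

```python
def generate_candidates(mention, synonyms, namesakes):
    """Union of the candidates found in a synonym table and a namesake table."""
    candidates = []
    for table in (synonyms, namesakes):
        for entity in table.get(mention, []):
            if entity not in candidates:
                candidates.append(entity)
    return candidates


def link_mentions(mentions, synonyms, namesakes, local_score, relatedness, alpha=0.5):
    """Pick the top-1 entity per mention.

    Each candidate's score mixes a local (mention, entity) score with the
    support it receives from candidates of co-occurring mentions.
    The mixing formula and alpha are hypothetical, not taken from the paper.
    """
    cands = {m: generate_candidates(m, synonyms, namesakes) for m in mentions}
    links = {}
    for m in mentions:
        others = [o for o in mentions if o != m]
        best, best_score = None, float("-inf")
        for e in cands[m]:
            # Support: for each other mention, take its best-supporting candidate.
            support = sum(
                max((local_score(o, c) * relatedness(e, c) for c in cands[o]), default=0.0)
                for o in others
            )
            if others:
                support /= len(others)
            score = alpha * local_score(m, e) + (1 - alpha) * support
            if score > best_score:
                best, best_score = e, score
        links[m] = best  # None when the mention has no candidates
    return links


# Toy data (hypothetical): the local score alone would prefer the fruit sense.
synonyms = {"Jobs": ["Steve Jobs"]}
namesakes = {"Apple": ["Apple Inc.", "Apple (fruit)"]}
local = {("Apple", "Apple Inc."): 0.4, ("Apple", "Apple (fruit)"): 0.6,
         ("Jobs", "Steve Jobs"): 1.0}
related = {frozenset(["Apple Inc.", "Steve Jobs"]): 0.9}

links = link_mentions(
    ["Apple", "Jobs"], synonyms, namesakes,
    local_score=lambda m, e: local.get((m, e), 0.0),
    relatedness=lambda a, b: related.get(frozenset([a, b]), 0.0),
)
print(links)
```

In this toy case the co-occurring mention "Jobs" pulls "Apple" toward the company sense, which is the kind of effect a collective method aims for and a purely context-similarity method would miss.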
Authors
冯钧
柳菁铧
孔盛球
FENG Jun; LIU Jing-hua; KONG Sheng-qiu (College of Computer and Information, Hohai University, Nanjing 211100, China)
Source
《计算机与现代化》
2019, No. 1, pp. 69-74, 94 (7 pages)
Computer and Modernization
Funding
National Key Research and Development Program of China (2017YFC0405806)
National Natural Science Foundation of China, General Program (61602151, 61370091)
Keywords
Chinese collective entity linking
knowledge graph
entity disambiguation