
Chinese Collective Entity Linking Method Based on Multiple Features (融合多特征的中文集成实体链接方法)
Abstract Entity linking is the process of mapping entity mentions in text to the corresponding entities in a knowledge base (KB), and it plays a key role in knowledge base expansion. Traditional entity linking methods rely mainly on surface features such as context similarity and ignore the semantic relatedness between co-occurring entities. To address this, a collective entity linking method that fuses multiple features is proposed. First, a synonym table and a namesake table are combined to generate the candidate entity set. Next, semantic features are extracted from multiple perspectives and fused into a constructed entity relatedness graph. Finally, the candidate entities are ranked and the top-1 entity is selected as the linking target. Experiments on the NLP&CC2013 Chinese microblog entity linking evaluation dataset yield an accuracy of 90.97%, which compares favorably with the best system in that evaluation.
Authors FENG Jun (冯钧); LIU Jing-hua (柳菁铧); KONG Sheng-qiu (孔盛球) (College of Computer and Information, Hohai University, Nanjing 211100, China)
Source Computer and Modernization (《计算机与现代化》), 2019, No. 1, pp. 69-74, 94 (7 pages)
Funding National Key R&D Program of China (2017YFC0405806); National Natural Science Foundation of China, General Program (61602151, 61370091)
Keywords Chinese collective entity linking; knowledge graph; entity disambiguation
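
The three stages named in the abstract (candidate generation from a synonym table and a namesake table, feature fusion over a graph of related entities, and top-1 ranking) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the features or the ranking algorithm, so the toy knowledge base, the Jaccard-based features, the `alpha` weight, and all function names below are assumptions chosen only for illustration.

```python
from typing import Dict, List, Optional, Set

# Toy knowledge base: entity id -> descriptive keywords (hypothetical data).
KB: Dict[str, Set[str]] = {
    "Apple_Inc": {"company", "iphone", "technology", "jobs"},
    "Apple_fruit": {"fruit", "tree", "food"},
    "Steve_Jobs": {"founder", "company", "iphone", "technology"},
}
# Synonym table (alias -> entities) and namesake table (ambiguous name -> entities).
SYNONYMS: Dict[str, List[str]] = {"Jobs": ["Steve_Jobs"]}
NAMESAKES: Dict[str, List[str]] = {"Apple": ["Apple_Inc", "Apple_fruit"]}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Set overlap, used here as a stand-in for both feature types."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def generate_candidates(mention: str) -> List[str]:
    """Stage 1: merge both tables, dropping duplicates but keeping order."""
    merged = SYNONYMS.get(mention, []) + NAMESAKES.get(mention, [])
    return list(dict.fromkeys(merged))


def link(mentions: List[str], context: Set[str],
         alpha: float = 0.5) -> Dict[str, Optional[str]]:
    """Stages 2 and 3: fuse a local and a collective feature, then pick top-1."""
    candidates = {m: generate_candidates(m) for m in mentions}
    result: Dict[str, Optional[str]] = {}
    for m, cands in candidates.items():
        best, best_score = None, -1.0
        for e in cands:
            # Local feature: similarity between the text context and the entity.
            local = jaccard(context, KB[e])
            # Collective feature: for every other mention, the strongest edge
            # from this candidate to any of that mention's candidates.
            edges = [
                max((jaccard(KB[e], KB[o]) for o in candidates[n]), default=0.0)
                for n in candidates if n != m
            ]
            coherence = sum(edges) / len(edges) if edges else 0.0
            score = alpha * local + (1 - alpha) * coherence
            if score > best_score:
                best, best_score = e, score
        result[m] = best  # None when no candidate exists for the mention
    return result


if __name__ == "__main__":
    print(link(["Apple", "Jobs"], {"iphone", "technology"}))
```

In this toy run, "Apple" resolves to `Apple_Inc` rather than `Apple_fruit` because it both matches the context and is strongly related to the only candidate for the co-occurring mention "Jobs". That interaction between co-occurring mentions is the kind of signal a collective method adds on top of context similarity alone.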