
Short Text Entity Disambiguation Method Combining Multiple Features and a Coarse-to-Fine Ranking Model (cited by: 1)
Abstract: Entity disambiguation for short text questions suffers from two problems: entity description information is missing, and abbreviations prevent the target entity from being recalled. To address both, an entity disambiguation method for short text questions is proposed that combines multiple features with a coarse-to-fine ranking model. First, an N-Gram word segmentation model is used to assist in recalling candidate entities. Then, the relations and neighboring entities of each candidate entity are selected from the knowledge graph, and the similarity of each to the question is computed; these similarities serve as the entity's description information in the knowledge graph and are combined with other features, such as entity importance, for feature fitting. Finally, a coarse ranking model reduces the size of the candidate entity set, and a fine ranking model ranks the remaining candidates to obtain the final target entity. Entity disambiguation experiments on the CCKS2019-CKBQA dataset show that the proposed model reaches an accuracy of 91.35%.
Authors: WANG Rong-kun; BIN Sheng; SUN Geng-xin (College of Computer Science & Technology, Qingdao University, Qingdao 266071, China)
Source: Journal of Qingdao University (Natural Science Edition), CAS, 2022, Issue 3, pp. 16-21 (6 pages)
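Funding: Supported by the Humanities and Social Sciences Research Youth Project of the Ministry of Education (Grant No. 15YJC860001), the Natural Science Foundation of Shandong Province (Grant No. ZR2017MG011), and the Shandong Provincial Social Science Planning Project (Grant No. 17CHLJ16).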
Keywords: entity disambiguation; short text question; feature fusion; CKBQA; ranking model; knowledge graph
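
The pipeline outlined in the abstract (N-gram candidate recall, knowledge-graph similarity features, then coarse and fine ranking) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the toy data (`MENTION_INDEX`, `KG`, `POPULARITY`), the character-overlap similarity, and the fixed weights all stand in for the trained models and real knowledge graph that the authors would have used.

```python
from typing import Dict, List, Tuple

# Hypothetical toy data: mention string -> entity ids, and a tiny knowledge graph.
MENTION_INDEX: Dict[str, List[str]] = {"apple": ["Apple_Inc", "Apple_fruit"]}
KG: Dict[str, Dict[str, List[str]]] = {
    "Apple_Inc": {"relations": ["founder", "headquarters"],
                  "neighbors": ["Steve Jobs", "Cupertino"]},
    "Apple_fruit": {"relations": ["genus", "color"],
                    "neighbors": ["Malus", "red"]},
}
POPULARITY: Dict[str, float] = {"Apple_Inc": 0.9, "Apple_fruit": 0.6}


def char_ngrams(text: str, max_n: int) -> List[str]:
    """Return all contiguous character n-grams of length 1..max_n."""
    return [text[i:i + n]
            for n in range(1, max_n + 1)
            for i in range(len(text) - n + 1)]


def recall_candidates(question: str, index: Dict[str, List[str]],
                      max_n: int = 5) -> List[str]:
    """Look up every n-gram of the question in the mention index."""
    found: List[str] = []
    for gram in char_ngrams(question, max_n):
        for entity in index.get(gram, []):
            if entity not in found:
                found.append(entity)
    return found


def overlap_similarity(a: str, b: str) -> float:
    """Jaccard overlap of character sets; a placeholder for a learned similarity."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa and sb else 0.0


def features(question: str, entity: str) -> Tuple[float, float, float]:
    """(best relation similarity, best neighbor similarity, entity importance)."""
    rel = max((overlap_similarity(question, r) for r in KG[entity]["relations"]),
              default=0.0)
    nb = max((overlap_similarity(question, n) for n in KG[entity]["neighbors"]),
             default=0.0)
    return rel, nb, POPULARITY.get(entity, 0.0)


def coarse_to_fine(question: str, candidates: List[str],
                   keep: int = 2) -> List[Tuple[str, float]]:
    """Prune with a cheap score, then rank the survivors with all features."""
    feats = {e: features(question, e) for e in candidates}
    # Coarse stage: relation similarity plus importance, keep the top `keep`.
    coarse = sorted(candidates, key=lambda e: feats[e][0] + feats[e][2],
                    reverse=True)[:keep]
    # Fine stage: weighted sum of all three features (weights are placeholders).
    weights = (0.4, 0.3, 0.3)
    scored = [(e, sum(w * f for w, f in zip(weights, feats[e]))) for e in coarse]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


question = "who founded apple"
candidates = recall_candidates(question, MENTION_INDEX)
ranking = coarse_to_fine(question, candidates)
print(ranking[0][0] if ranking else None)
```

The two-stage split keeps the expensive scoring limited to a small candidate set; in a real system both stages would presumably be trained rankers rather than hand-set weights.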
