
Research on Word Sense Disambiguation Based on a Heterogeneous Relation Graph (cited by: 11)

WSD Method Based on Heterogeneous Relation Graph
Abstract: Word sense disambiguation (WSD), one of the central problems in natural language processing, aims to identify the intended sense of a word in context. Traditional knowledge-based WSD methods usually rely on a single type of knowledge (semantic relations or co-occurrence relations) and ignore the complementarity between different knowledge types. To address this problem, this paper reconstructs the traditional graph-based WSD model and, supported by comparative experiments, proposes a WSD model based on a heterogeneous relation graph, which integrates multiple types of disambiguation knowledge into a single graph so that they can be exploited jointly. Because not all knowledge types contribute equally to WSD, an automatic parameter estimation method adapted from simulated annealing is designed and implemented to estimate the weight of each relation type and thereby optimize its influence on disambiguation. The proposed model is unsupervised; it makes full use of multi-source knowledge and alleviates the data sparseness and knowledge acquisition bottleneck problems. Evaluated on the SemEval-2007 multilingual Chinese-English lexical task, the proposed method significantly outperforms the baseline and also performs better than the best system that participated in the evaluation.
Source: Journal of Computer Research and Development (《计算机研究与发展》), indexed in EI, CSCD, and the Peking University Core Journals list; 2013, No. 2, pp. 437-444 (8 pages)
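Funding: National Natural Science Foundation of China (61132009); Beijing Institute of Technology Science and Technology Innovation Program, Major Project Cultivation Special Fund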
Keywords: multi-source knowledge; heterogeneous relation graph; PageRank; parameter estimation; simulated annealing
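The abstract and keywords name two ingredients: PageRank over a graph whose edges come from different relation types, and simulated annealing to estimate one weight per relation type. The following is a minimal sketch of how these two pieces could fit together, not the authors' implementation. The toy graph, the sense labels, the gold set, the accuracy objective, the weight bounds, and the cooling schedule are all hypothetical; the record does not state which objective or PageRank variant the paper uses.

```python
import math
import random
from collections import defaultdict

# Hypothetical toy graph: nodes are "word#sense", each edge carries a relation type.
EDGES = [
    ("bank#finance", "money#currency", "semantic"),
    ("bank#finance", "loan#credit", "cooccurrence"),
    ("bank#river", "water#liquid", "semantic"),
    ("money#currency", "loan#credit", "cooccurrence"),
]
GOLD = {"bank": "bank#finance"}  # hypothetical gold senses used as the objective


def pagerank(edges, type_weights, damping=0.85, iterations=50):
    """Weighted PageRank on an undirected graph; edge weight depends on relation type."""
    adj = defaultdict(dict)
    for u, v, rel in edges:
        w = type_weights[rel]
        adj[u][v] = adj[u].get(v, 0.0) + w
        adj[v][u] = adj[v].get(u, 0.0) + w
    nodes = sorted(adj)
    out_weight = {n: sum(adj[n].values()) for n in nodes}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {}
        for n in nodes:
            # Each neighbour m passes on rank in proportion to the weight of edge (m, n).
            incoming = sum(rank[m] * w / out_weight[m] for m, w in adj[n].items())
            new_rank[n] = (1.0 - damping) / len(nodes) + damping * incoming
        rank = new_rank
    return rank


def disambiguate(edges, type_weights):
    """Pick, for every word, the candidate sense with the highest PageRank score."""
    rank = pagerank(edges, type_weights)
    best = {}
    for node in rank:
        word = node.split("#")[0]
        if word not in best or rank[node] > rank[best[word]]:
            best[word] = node
    return best


def accuracy(edges, type_weights, gold):
    """Fraction of gold-labelled words whose predicted sense matches the gold sense."""
    predicted = disambiguate(edges, type_weights)
    return sum(predicted.get(w) == s for w, s in gold.items()) / len(gold)


def anneal(edges, gold, rel_types, steps=200, t0=1.0, cooling=0.97, seed=0):
    """Simulated annealing over relation-type weights (kept within [0.01, 10])."""
    rng = random.Random(seed)
    current = {r: 1.0 for r in rel_types}
    cur_score = accuracy(edges, current, gold)
    best, best_score = dict(current), cur_score
    temp = t0
    for _ in range(steps):
        cand = dict(current)
        r = rng.choice(rel_types)
        cand[r] = min(10.0, max(0.01, cand[r] + rng.uniform(-0.5, 0.5)))
        score = accuracy(edges, cand, gold)
        # Always accept improvements; accept worse moves with a temperature-dependent chance.
        if score >= cur_score or rng.random() < math.exp((score - cur_score) / temp):
            current, cur_score = cand, score
        if cur_score > best_score:
            best, best_score = dict(current), cur_score
        temp *= cooling
    return best, best_score


if __name__ == "__main__":
    weights, score = anneal(EDGES, GOLD, ["cooccurrence", "semantic"])
    print(weights, score, disambiguate(EDGES, weights))
```

Keeping one scalar weight per relation type keeps the search space small, so a simple random-perturbation annealer is enough for a sketch like this; a real system would need a much larger graph and a principled objective.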

