
A Multi-strategy Method for Word Sense Disambiguation Based on Wikipedia

Cited by: 1
Abstract [Objective] This paper proposes a multi-strategy method for word sense disambiguation (WSD) based on Wikipedia that makes full use of the latent knowledge in Wikipedia. [Methods] Three indicators are designed: category commonness, content relatedness, and word sense importance. The indicator values are fused linearly with dynamic entropy weights, and a re-disambiguation step is then applied to choose the best sense of an ambiguous term in its specific context. [Results] In experiments the method achieved an average precision of 74.82%, which supports its feasibility and effectiveness. [Limitations] The candidate senses are fine-grained, and the method mainly targets English, so it lacks generality for other languages. [Conclusions] Wikipedia supplies additional semantic knowledge and background information for disambiguation, which can improve precision.
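Source: New Technology of Library and Information Service (《现代图书情报技术》), CSSCI, 2015, No. 11, pp. 18-25 (8 pages)
Funding: This work is one of the research outcomes of the Beijing Natural Science Foundation pre-exploration project "Research on Concept Map Representation of the Invention Process and Mechanism" (No. 9153020) and the 2015 Beijing Municipal Education Commission Social Science Program general project "A Concept-Map-Based Method for Describing the Mechanism of the Invention Process" (No. SM201510005001).
Keywords: Word sense disambiguation; Wikipedia; Relatedness; Entropy coefficient; Re-disambiguation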
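
The abstract names an entropy-weighted linear fusion of three indicators but gives no formulas, so the following is only a minimal sketch of the standard entropy weight method applied to that setting, not the authors' implementation. The function names, the sense labels, and the indicator scores are all hypothetical, and the re-disambiguation step is not modelled. The idea is that an indicator whose values barely differ across candidate senses carries little information and receives a small weight, while the weights are recomputed for every ambiguous term, which is one plausible reading of "dynamic".

```python
import math


def entropy_weights(scores):
    """Compute one weight per indicator with the standard entropy weight method.

    scores: list of rows, one row per candidate sense, each row holding
    non-negative indicator values (assumed here: category commonness,
    content relatedness, sense importance).
    """
    n = len(scores)
    m = len(scores[0])
    if n < 2:
        # Entropy is undefined for a single candidate; fall back to equal weights.
        return [1.0 / m] * m

    diversities = []
    for j in range(m):
        column = [row[j] for row in scores]
        total = sum(column)
        if total == 0:
            # An all-zero indicator cannot separate the candidates.
            diversities.append(0.0)
            continue
        probs = [v / total for v in column]
        # Normalized Shannon entropy in [0, 1]; terms with p == 0 contribute 0.
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(n)
        # Clamp to guard against tiny negative values from rounding.
        diversities.append(max(0.0, 1.0 - entropy))

    norm = sum(diversities)
    if norm < 1e-12:
        # No indicator discriminates; fall back to equal weights.
        return [1.0 / m] * m
    return [d / norm for d in diversities]


def best_sense(senses, scores):
    """Fuse the indicators linearly and return (best sense, fused scores)."""
    weights = entropy_weights(scores)
    fused = [sum(w * v for w, v in zip(weights, row)) for row in scores]
    best_index = max(range(len(senses)), key=lambda i: fused[i])
    return senses[best_index], fused


if __name__ == "__main__":
    # Hypothetical candidate senses and made-up indicator values.
    senses = ["Bank (finance)", "Bank (river)", "Bank (aviation)"]
    scores = [
        [0.5, 0.9, 0.1],
        [0.5, 0.1, 0.1],
        [0.5, 0.2, 0.8],
    ]
    print(entropy_weights(scores))
    print(best_sense(senses, scores))
```

In this toy input the first indicator is identical for every candidate, so its entropy is maximal and its weight drops to essentially zero; the decision is then driven by the two indicators that actually distinguish the senses.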


