
Modified word similarity computation approach based on HowNet (cited by: 26)
Abstract: HowNet is a fairly detailed Chinese semantic knowledge dictionary that describes words using 1618 sememes, so related words share sememes when they are described with HowNet concepts. Building on this regularity and combining it with existing word similarity computation methods, the paper proposes an improved method for computing the similarity of related word pairs. It also introduces the concept of weak sememes and excludes their interference from the word similarity computation. Experiments show that the improved method agrees better with human intuition and is better suited to text mining.
Source: Journal of Computer Applications (《计算机应用》), CSCD, Peking University Core Journal, 2009, No. 1, pp. 217-220 (4 pages)
Keywords: HowNet; word similarity; related word pair; weak sememe
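The abstract does not give the paper's formula, so the following is only a minimal sketch of the two ideas it names: related words share sememes, and weak sememes should not count toward similarity. It swaps in a plain Jaccard overlap of sememe sets as a stand-in for the paper's method. The sememe sets in `TOY_SEMEMES`, the choice of `WEAK_SEMEMES`, and the function name `sememe_overlap` are all hypothetical and are not taken from HowNet or from the paper.

```python
# Toy sememe descriptions. Hypothetical data for illustration only;
# these are NOT real HowNet entries.
TOY_SEMEMES = {
    "doctor": {"human", "occupation", "medical", "cure"},
    "patient": {"human", "medical", "disease", "suffer"},
    "thief": {"human", "steal", "crime"},
}

# Assumed weak sememes: very general sememes that many unrelated
# words share. Which sememes count as weak is an assumption here.
WEAK_SEMEMES = frozenset({"human"})


def sememe_overlap(word_a, word_b, weak=frozenset()):
    """Jaccard overlap of two words' sememe sets, ignoring weak sememes.

    This is a stand-in measure, not the formula proposed in the paper.
    """
    a = TOY_SEMEMES[word_a] - weak
    b = TOY_SEMEMES[word_b] - weak
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    for pair in [("doctor", "patient"), ("doctor", "thief")]:
        raw = sememe_overlap(*pair)
        filtered = sememe_overlap(*pair, weak=WEAK_SEMEMES)
        print(pair, round(raw, 3), round(filtered, 3))
```

In this toy data, "doctor" and "thief" share only the general sememe "human", so filtering it out drops their score to zero, while "doctor" and "patient" still share "medical" and keep a nonzero score.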
