
A Similar Words Algorithm Automatically Classifying Different Senses
Abstract: Word similarity is widely used across many areas of natural language processing. However, similarity is usually computed between words rather than between word senses. To address this, the paper proposes a classification algorithm for similar words. The algorithm first uses the PMImax tool to compute the similar words of a target word. Then, taking the WordNet senses of the target word as the reference, it applies an improved Lesk algorithm to automatically sort those similar words by sense, so that each class of similar words is similar only to its corresponding sense. Experimental results show that the classification accuracy reaches up to 84.27%.
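The two steps named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract gives no details of PMImax or of the improved Lesk variant, so the code below uses plain pointwise mutual information and a basic Lesk-style gloss overlap instead. The sense identifiers, glosses, stopword list, and all function names are hypothetical stand-ins, and a toy dictionary takes the place of WordNet.

```python
import math

# Hypothetical toy sense inventory standing in for WordNet glosses.
SENSES = {
    "bank.n.01": "sloping land beside a body of water such as a river",
    "bank.n.02": "a financial institution that accepts deposits and lends money",
}

# Hypothetical glosses for candidate similar words of the target word "bank".
CANDIDATE_GLOSSES = {
    "shore": "the land along the edge of a body of water",
    "lender": "someone who lends money or gives credit in business",
}

# Small illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"a", "an", "the", "of", "or", "and", "that", "such", "as", "in", "who", "to"}


def pmi(pair_count, x_count, y_count, total):
    """Plain PMI, log2(p(x, y) / (p(x) * p(y))), computed from raw counts."""
    return math.log2((pair_count * total) / (x_count * y_count))


def content_words(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return {w for w in text.lower().split() if w not in STOPWORDS}


def classify_by_gloss_overlap(candidate_gloss, senses):
    """Return the sense id sharing the most content words, or None if no overlap."""
    cand = content_words(candidate_gloss)
    scores = {sid: len(cand & content_words(gloss)) for sid, gloss in senses.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def group_similar_words(candidates, senses):
    """Assign each candidate word to the sense whose gloss it overlaps most."""
    groups = {sid: [] for sid in senses}
    for word, gloss in candidates.items():
        sid = classify_by_gloss_overlap(gloss, senses)
        if sid is not None:
            groups[sid].append(word)
    return groups


if __name__ == "__main__":
    print(group_similar_words(CANDIDATE_GLOSSES, SENSES))
```

In this toy setup each candidate lands in exactly one sense group, mirroring the abstract's goal that every class of similar words relates to a single sense. A real pipeline would rank candidates by a corpus-derived similarity score first and would read glosses from WordNet rather than from hand-written strings.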
Author: Wang Yongsheng (王永生)
Source: Computer Applications and Software (《计算机应用与软件》), CSCD, 2015, No. 3, pp. 258-260, 288 (4 pages)
Funding: Ministry of Education Humanities and Social Sciences Research Fund, Youth Project (07JC740009)
Keywords: word similarity; pointwise mutual information (PMI); Lesk algorithm; WordNet

