
An Algorithm for Semantic Similarity of Short Text Based on WordNet (cited by 34)
Abstract: Semantic similarity computation for short texts is increasingly used in document retrieval, information extraction, and text mining. This paper presents ST-CW, an algorithm for computing the semantic similarity of short texts. ST-CW uses WordNet and the Brown Corpus to compute the similarity of the concepts in a text, and on that basis proposes a new method that combines concept similarity with syntactic and other information to compute short-text semantic similarity. Experiments on the R&B and Miller datasets confirm the effectiveness of the algorithm.
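Source: Acta Electronica Sinica (《电子学报》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2012, No. 3, pp. 617-620 (4 pages)
Funding: National Natural Science Foundation of China (No. 61175023, No. 60903097)
Keywords: semantic similarity of short text; WordNet; corpus-based method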
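
The abstract does not give the ST-CW formulas, so the following is only a rough sketch of the general idea it describes: concept similarity derived from a taxonomy plus corpus frequencies, then aggregated over two short texts. Everything here is an assumption made for illustration: the toy taxonomy `PARENT` stands in for WordNet, the counts in `COUNTS` stand in for Brown Corpus frequencies, the concept measure is a Jiang-Conrath-style information-content distance, and the text-level score is a simple symmetric best-match average. The syntactic information that ST-CW also uses is not modeled.

```python
import math

# Hypothetical toy IS-A taxonomy (child -> parent); a stand-in for WordNet.
PARENT = {
    "entity": None,
    "animal": "entity",
    "vehicle": "entity",
    "dog": "animal",
    "cat": "animal",
    "car": "vehicle",
    "bus": "vehicle",
}

# Hypothetical occurrence counts; a stand-in for Brown Corpus frequencies.
COUNTS = {"entity": 1, "animal": 4, "vehicle": 3,
          "dog": 10, "cat": 8, "car": 12, "bus": 2}


def ancestors(concept):
    """Return the path from a concept up to the root, concept first."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def information_content(concept):
    """IC(c) = -log p(c), where p(c) counts c and all of its descendants."""
    total = sum(COUNTS.values())
    mass = sum(n for c, n in COUNTS.items() if concept in ancestors(c))
    return -math.log(mass / total)


def concept_similarity(a, b):
    """Jiang-Conrath-style distance mapped into the interval (0, 1]."""
    # Lowest common subsumer: first ancestor of a that is also an ancestor of b.
    lcs = next(c for c in ancestors(a) if c in ancestors(b))
    distance = (information_content(a) + information_content(b)
                - 2 * information_content(lcs))
    return 1.0 / (1.0 + distance)


def text_similarity(words_a, words_b):
    """Symmetric average of each word's best match in the other text."""
    def directed(src, dst):
        return sum(max(concept_similarity(s, d) for d in dst) for s in src) / len(src)
    return (directed(words_a, words_b) + directed(words_b, words_a)) / 2
```

In this sketch every word must already appear in the toy taxonomy; a real implementation would first map words to WordNet synsets and handle out-of-vocabulary words. Calling `text_similarity(["dog", "car"], ["cat", "bus"])` returns a score in (0, 1], with identical concepts scoring 1.0.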

References (15)

  • 1 Yang Zhen, Fan Kefeng, Lei Jianjun, Guo Jun. Research on semantics-based text manifolds [J]. Acta Electronica Sinica, 2009, 37(3): 557-561. (cited by 10)
  • 2 T K Landauer, D Laham, B Rehder, M E Schreiner. How well can passage meaning be derived without using word order? A comparison of latent semantic analysis and humans [A]. Proc 19th Ann Meeting of the Cognitive Science Soc [C]. Mahwah, NJ: Lawrence Erlbaum, 1997. 412-417.
  • 3 Jiang, Jay J, David W Conrath. Semantic similarity based on corpus statistics and lexical taxonomy [A]. Proceedings of International Conference on Research in Computational Linguistics [C]. Taiwan: IEEE, 1997. 19-33.
  • 4 C Burgess, K Livesay, K Lund. Explorations in context space: words, sentences, discourse [J]. Discourse Processes, 1998, 25(2-3): 211-257.
  • 5 Zhang Dongna, Zhou Chunguang, Liu Yanbin, Guo Dongwei. A semantic similarity calculation method based on WordNet and corpus statistics [J]. Journal of Jilin University (Science Edition), 2010, 48(5): 811-816. (cited by 6)
  • 6 E K Park, D Y Ra, M G Jang. Techniques for improving web retrieval effectiveness [J]. Information Processing and Management, 2005, 41(5): 1207-1223.
  • 7 WordNet Documentation [EB/OL]. http://wordnet.princeton.edu/wordnet/documentation/, October 27, 2010.
  • 8 Li Y, Mclean D, Bandar Z, O'Shea J, Crockett K. Sentence similarity based on semantic nets and corpus statistics [J]. IEEE Transactions on Knowledge and Data Engineering, 2006, 18(8): 1138-1149.
  • 9 Madhavan J, Bernstein P, Doan A, Halevy A. Corpus-based schema matching [A]. Proceedings of the International Conference on Data Engineering [C]. Tokyo: IEEE Computer Society, 2005. 57-68.
  • 10 G A Miller. WordNet: a lexical database for English [J]. Comm ACM, 1995, 38(11): 39-41.


