Computing Text Similarity in the VSM Using Semantic Clips as Feature Terms (VSM中用语片为特征项计算文本相似度). Cited by: 2

Method of Text Semantic Similarity Computing Based on VSM Using the Skeleton Semantic Clip
Abstract: This paper defines the concept of the skeleton semantic clip. Mutual information serves as the reference measure of how strongly two words are related. Using dependency relations and basic grammar, two words whose relatedness meets a threshold are combined into a skeleton semantic clip. Skeleton semantic clips are used as the feature terms, text semantics are represented with the vector space model (VSM), each clip is weighted by its frequency of occurrence, and the semantic similarity between texts is computed with the cosine measure. The method is applied to the automatic grading of subjective exam questions, and experiments confirm that its results are satisfactorily accurate.
Author: Pan Guoqing (潘国清)
Source: Computer & Digital Engineering (《计算机与数字工程》), 2007, No. 10, pp. 24-25, 34 (3 pages)
Keywords: vector space model; relevancy; skeleton semantic clip; mutual information; semantic similarity of sentences
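
The pipeline summarized in the abstract (score word pairs by mutual information, keep the pairs above a threshold as clips, weight each clip by its frequency, compare texts by cosine) can be illustrated with a minimal sketch. It is not the paper's implementation. The function names, the toy corpus, and the threshold of 0.5 are all hypothetical. The sketch also assumes that a clip "occurs" whenever both of its words appear in the same sentence, whereas the paper additionally filters pairs through dependency relations and basic grammar, a step omitted here. Mutual information is taken in its pointwise form, log2(p(a,b) / (p(a) p(b))), estimated from sentence-level counts.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(sentences):
    """Pointwise mutual information for every word pair that co-occurs
    in at least one sentence. Probabilities are sentence-level relative
    frequencies (an assumption of this sketch)."""
    n = len(sentences)
    word_df = Counter()
    pair_df = Counter()
    for tokens in sentences:
        vocab = sorted(set(tokens))
        word_df.update(vocab)
        pair_df.update(combinations(vocab, 2))  # pairs are alphabetically ordered
    return {
        (a, b): math.log2((c / n) / ((word_df[a] / n) * (word_df[b] / n)))
        for (a, b), c in pair_df.items()
    }


def select_clips(pmi, threshold):
    """Keep the word pairs whose PMI meets the threshold (hypothetical value)."""
    return {pair for pair, score in pmi.items() if score >= threshold}


def clip_vector(text, clips):
    """Represent a text (a list of tokenized sentences) as clip frequencies:
    the number of sentences in which both words of a clip appear."""
    vec = Counter()
    for tokens in text:
        present = set(tokens)
        for a, b in clips:
            if a in present and b in present:
                vec[(a, b)] += 1
    return vec


def cosine(u, v):
    """Cosine similarity between two sparse frequency vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy corpus used only to estimate PMI (hypothetical data).
corpus = [
    ["stack", "push", "element"],
    ["stack", "pop", "element"],
    ["queue", "enqueue", "element"],
    ["queue", "dequeue", "element"],
]
clips = select_clips(pmi_table(corpus), threshold=0.5)

reference_answer = [["stack", "push", "element"], ["stack", "pop", "element"]]
student_answer = [["stack", "push", "element"]]
score = cosine(clip_vector(reference_answer, clips),
               clip_vector(student_answer, clips))
print(f"similarity = {score:.3f}")
```

In a grading setting, the similarity between a student's answer and the reference answer would then be mapped to a mark. That mapping is not described in the abstract, so it is left out of the sketch.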