A Text Search System Based on Word Relation (基于词关联度的文本检索系统)
Abstract: Based on a statistical analysis of a corpus, this paper proposes the concept of word relation. The relation between any two words is obtained by counting how often each word occurs in the text collection and how often the two words occur together, measured as the number of documents containing the word or the word pair. This measure is then used to adjust the semantic vector, which effectively addresses the word-dependency problem of the traditional vector space model, where terms are treated as if they were independent. Combined with an inverted index, a text search system of considerable scale was built on this approach, and several design issues are discussed in detail. Test results indicate that the system achieves good retrieval quality and good performance, and that it is of practical value.
Source: Microcomputer Applications (《微型电脑应用》), 2011, No. 3, pp. 62-64, 6 (4 pages in total)
Keywords: word relation; information retrieval; vector space model (VSM); inverted index
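
The abstract does not give the paper's formula for word relation or for the vector adjustment, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes the relation between two words is the Jaccard overlap of their document sets, and that adjustment adds to each related term a share of the original term's weight. The function names (`build_index`, `relation`, `adjust`, `search`) and the toy documents are hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict
from itertools import combinations


def build_index(docs):
    """Build an inverted index and pair document counts from tokenized docs."""
    index = defaultdict(set)    # word -> ids of documents containing it
    pair_df = defaultdict(int)  # (w1, w2) with w1 < w2 -> docs containing both
    for doc_id, tokens in enumerate(docs):
        words = sorted(set(tokens))
        for word in words:
            index[word].add(doc_id)
        for a, b in combinations(words, 2):
            pair_df[(a, b)] += 1
    return dict(index), dict(pair_df)


def relation(a, b, index, pair_df):
    """ASSUMED measure: Jaccard overlap of the two words' document sets."""
    if a == b:
        return 1.0
    both = pair_df.get((a, b) if a < b else (b, a), 0)
    either = len(index.get(a, ())) + len(index.get(b, ())) - both
    return both / either if either else 0.0


def adjust(vector, index, pair_df):
    """ASSUMED adjustment: give each related term a share of a term's weight."""
    adjusted = defaultdict(float)
    for term, weight in vector.items():
        adjusted[term] += weight
        for other in index:
            if other != term:
                r = relation(term, other, index, pair_df)
                if r > 0:
                    adjusted[other] += r * weight
    return dict(adjusted)


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def search(query_tokens, docs, index, pair_df):
    """Rank documents against the relation-adjusted query vector."""
    query = adjust({t: 1.0 for t in set(query_tokens)}, index, pair_df)
    # The inverted index limits scoring to documents sharing a query term.
    candidates = set()
    for term in query:
        candidates |= index.get(term, set())
    scored = []
    for doc_id in candidates:
        doc_vec = defaultdict(float)
        for token in docs[doc_id]:
            doc_vec[token] += 1.0  # raw term frequency as the weight
        scored.append((cosine(query, doc_vec), doc_id))
    return sorted(scored, key=lambda s: (-s[0], s[1]))


docs = [
    ["car", "engine"],
    ["car", "engine", "road"],
    ["automobile", "engine"],
    ["banana", "fruit"],
]
index, pair_df = build_index(docs)
results = search(["automobile"], docs, index, pair_df)
print([doc_id for _, doc_id in results])  # → [2, 0, 1]
```

In this toy collection a plain vector space model would match only document 2 for the query "automobile". Because "automobile" and "engine" co-occur, the adjusted query also carries weight on "engine", so documents 0 and 1 are retrieved with lower scores, while the unrelated document 3 is never scored.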