
English Texts Retrieval Algorithm Based on SVD (基于奇异值分解的英文文本检索算法)

Cited by: 2
Abstract: A retrieval algorithm for English texts is proposed. Keyword terms are extracted from each text, a state matrix of the keyword terms is computed from their transition probabilities, and the first singular vector obtained by Singular Value Decomposition (SVD) of that matrix is taken as the text's complex feature vector. The cosine similarity between these vectors serves as the similarity measure between the query and the documents. Experimental results indicate that the algorithm outperforms the traditional LSA algorithm in both retrieval precision and computational efficiency.
Author: 高仕龙 (Gao Shilong)
Source: Computer Engineering (《计算机工程》), 2011, No. 1, pp. 78-80 (3 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list.
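Funding: Sichuan Provincial Department of Education Fund project "Detection and Parameter Estimation of Linear Frequency-Modulated Signals Based on Chaotic Systems" (09ZB026)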
Keywords: texts retrieval; transition probability; Singular Value Decomposition (SVD); state matrix
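
The pipeline outlined in the abstract (keywords, transition probabilities, state matrix, first singular vector, cosine similarity) can be illustrated with a minimal sketch. This is not the paper's implementation: the record does not give the exact definition of the state matrix or explain how the feature vector becomes complex-valued. The sketch assumes that the state matrix is a row-normalized matrix of first-order transitions between consecutive keywords over a vocabulary shared by the query and the documents, that keyword extraction is plain stopword filtering, and that the feature vector is the real-valued first left singular vector. The names `keywords`, `transition_matrix`, `feature_vector`, `cosine`, and `rank` are hypothetical.

```python
import re

import numpy as np

# Tiny illustrative stopword list (an assumption, not the paper's keyword extractor).
STOPWORDS = {"the", "a", "an", "of", "and", "is", "in", "to", "on", "for"}


def keywords(text):
    """Lowercase the text, split on non-letters, and drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def transition_matrix(tokens, vocab):
    """Row-normalized counts of transitions between consecutive keywords.

    Assumed form of the 'state matrix': entry (i, j) estimates the
    probability that keyword j directly follows keyword i.
    """
    index = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(vocab)))
    for cur, nxt in zip(tokens, tokens[1:]):
        if cur in index and nxt in index:
            counts[index[cur], index[nxt]] += 1.0
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows without outgoing transitions stay all-zero.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)


def feature_vector(text, vocab):
    """First left singular vector of the text's transition matrix."""
    m = transition_matrix(keywords(text), vocab)
    if m.size == 0 or not m.any():
        # No transitions observed: return a zero vector instead of an arbitrary one.
        return np.zeros(len(vocab))
    u, _, _ = np.linalg.svd(m)
    return u[:, 0]


def cosine(a, b):
    """Cosine similarity; the absolute value removes the sign ambiguity of SVD."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return 0.0 if denom == 0 else abs(float(a @ b)) / denom


def rank(query, docs):
    """Return document indices sorted by decreasing similarity, plus the scores."""
    vocab = sorted({w for t in [query, *docs] for w in keywords(t)})
    q = feature_vector(query, vocab)
    scores = [cosine(q, feature_vector(d, vocab)) for d in docs]
    order = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return order, scores


if __name__ == "__main__":
    docs = ["cats chase mice", "svd text lsa text"]
    print(rank("svd text lsa text", docs))
```

One caveat of this sketch: for short texts in which every keyword transition occurs only once, the largest singular values are tied, so the first singular vector is not uniquely determined and the resulting scores can be unstable. Longer texts with repeated transitions give a better-separated leading singular value.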

References (6)

  • [1] Deerwester S, Dumais S T, Furnas G W, et al. Indexing by Latent Semantic Analysis[J]. Journal of the American Society for Information Science, 1990, 41(6): 391-407.
  • [2] Wei Wei, Wang Jianmin. A Fast Latent Semantic Indexing Method for Large-Scale Data[J]. Computer Engineering, 2009, 35(15): 35-37. (In Chinese; cited by: 10)
  • [3] Salton G, Wong A, Yang Chung-Shu. A Vector Space Model for Automatic Indexing[J]. Communications of the ACM, 1975, 18(11): 613-620.
  • [4] Kalt T. A New Probabilistic Model of Text Classification and Retrieval[R]. Amherst, USA: Center for Intelligent Information Retrieval, University of Massachusetts Amherst, Technical Report: IR-78, 1996.
  • [5] Lewis D D. Naive (Bayes) at Forty: The Independence Assumption in Information Retrieval[C]//Proc. of ECML'98. Berlin, Germany: Springer, 1998.
  • [6] Landauer T K. A Solution to Plato's Problem: The Latent Semantic Analysis Theory of the Acquisition, Induction, and Representation of Knowledge[J]. Psychological Review, 1997, 104(2): 211-240.

