
Optimization Research for Query Expansion Based on Semantic Computation
Abstract: Query expansion (QE) adds new terms related to the user's original search terms to the original query, forming a longer and more precise query that compensates for the limited information in short user queries. To improve the efficiency of text retrieval and to take users' personal preferences into account in a web retrieval setting, introducing semantic computation into query expansion is an important research direction. From the perspective of semantic computation, this paper proposes a query expansion algorithm based on a semantic relation tree. Generating the tree dynamically effectively reduces the work of computing the word similarity matrix, and controlling the tree's hierarchical structure and complexity allows different semantic space models to be generated flexibly and efficiently. Experiments show that the algorithm effectively improves the accuracy of text retrieval.
Source: Journal of the China Society for Scientific and Technical Information (《情报学报》), CSSCI, Peking University Core Journal, 2007, No. 5, pp. 704-710 (7 pages)
Keywords: semantic computation; query expansion; semantic relation tree; text retrieval
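
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it describes: grow a relation tree outward from a query term and score only the word pairs the tree actually visits, instead of filling a full word-by-word similarity matrix. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy corpus `DOCS`, the co-occurrence cosine used as the similarity measure, and the `max_depth`, `branching`, and `threshold` parameters that stand in for "controlling the tree's hierarchical structure and complexity".

```python
from collections import deque
from math import sqrt

# Toy corpus (hypothetical data, for illustration only).
DOCS = [
    "query expansion improves text retrieval",
    "semantic computation supports query expansion",
    "text retrieval uses an inverted index",
    "semantic similarity between words",
]


def build_postings(docs):
    """Map each word to the set of ids of the documents that contain it."""
    postings = {}
    for doc_id, text in enumerate(docs):
        for word in set(text.split()):
            postings.setdefault(word, set()).add(doc_id)
    return postings


def similarity(a, b, postings):
    """Cosine similarity of two words' binary document-occurrence vectors.

    This measure is an assumption; the paper's semantic measure may differ.
    """
    docs_a, docs_b = postings.get(a, set()), postings.get(b, set())
    if not docs_a or not docs_b:
        return 0.0
    return len(docs_a & docs_b) / sqrt(len(docs_a) * len(docs_b))


def build_relation_tree(root, postings, max_depth=2, branching=2, threshold=0.3):
    """Grow a tree from `root` breadth-first, scoring only visited pairs.

    Returns a dict mapping each node to its list of (child, score) pairs.
    """
    tree = {root: []}
    seen = {root}
    queue = deque([(root, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # depth limit bounds the size of the tree
        scored = [(similarity(node, w, postings), w) for w in postings if w not in seen]
        scored = [(s, w) for s, w in scored if s >= threshold]
        # Highest score first; ties broken alphabetically for determinism.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        for score, word in scored[:branching]:
            seen.add(word)
            tree[node].append((word, score))
            tree[word] = []
            queue.append((word, depth + 1))
    return tree


def expand_query(terms, postings, **tree_options):
    """Return the original terms followed by the nodes of each term's tree."""
    expanded = list(terms)
    for term in terms:
        for node in build_relation_tree(term, postings, **tree_options):
            if node not in expanded:
                expanded.append(node)
    return expanded


postings = build_postings(DOCS)
print(expand_query(["query"], postings, max_depth=1, branching=2))
```

In this sketch each tree node is compared against the vocabulary once, so the number of similarity computations grows with the number of nodes in the tree rather than with the square of the vocabulary size. Raising `max_depth` or `branching` widens the expansion; raising `threshold` narrows it.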











