

A Hybrid Dependency-Based Query Expansion Method
Abstract: Because natural language is ambiguous and varied, a few keywords rarely express a user's real information need. Query expansion strengthens a retrieval system's understanding of a query by mining the latent information in the original query terms. This paper adds a sentence weight to the scoring formula of local context analysis, on the assumption that the more important a sentence returned for the original query is, the more closely the words in it are related to the query terms. It further assumes that a word's positional relationship and dependency relationship to the query terms within the sentence are also important criteria for selecting expansion terms. Accordingly, the paper proposes two methods: Sentence Weight & Position-based Context Analysis (SWPCA) and Sentence Weight & Dependency-based Context Analysis (SWDCA). Both methods are applied to the TREC definitional question answering task. The experimental data show that both methods achieve good results, with SWDCA performing slightly better than SWPCA.
Source: Computer & Digital Engineering, 2012, No. 11, pp. 35-38 (4 pages)
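Funding: Supported by the 2012 Hainan Provincial Natural Science Foundation (Nos. 612120, 612121), the Haikou Key Science and Technology Projects of 2010 and 2012 (Nos. 2010071, 2012050), and the Jiangxi Provincial Education Science "Twelfth Five-Year Plan" project (No. 10YB083).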
Keywords: local context analysis; dependency relation; query expansion; information retrieval
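The abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea behind a sentence-weighted, position-aware candidate score. Everything in it is an assumption made for illustration, not the paper's method: the function name `expansion_scores`, the input layout (a list of token lists paired with a sentence weight), and the choice of inverse token distance as the position factor.

```python
from collections import defaultdict


def expansion_scores(sentences, query_terms, use_position=True):
    """Score candidate expansion terms from retrieved sentences.

    sentences: list of (tokens, weight) pairs, where weight is the
        importance assigned to the sentence (hypothetical input layout).
    query_terms: the original query terms.
    use_position: if True, scale each contribution by the inverse of the
        distance to the nearest query term (an assumed position factor).
    """
    query = {t.lower() for t in query_terms}
    scores = defaultdict(float)
    for tokens, weight in sentences:
        tokens = [t.lower() for t in tokens]
        q_positions = [i for i, t in enumerate(tokens) if t in query]
        if not q_positions:
            continue  # sentence contains no query term, nothing to co-occur with
        for i, tok in enumerate(tokens):
            if tok in query:
                continue  # query terms are not candidates for expansion
            if use_position:
                dist = min(abs(i - p) for p in q_positions)  # always >= 1 here
                proximity = 1.0 / dist
            else:
                proximity = 1.0
            # Words in heavier sentences and nearer the query term score higher.
            scores[tok] += weight * proximity
    return dict(scores)


retrieved = [
    (["aspirin", "relieves", "pain"], 2.0),
    (["aspirin", "is", "a", "drug"], 1.0),
]
ranked = sorted(expansion_scores(retrieved, ["aspirin"]).items(),
                key=lambda kv: kv[1], reverse=True)
print(ranked)
```

A dependency-based variant in the spirit of SWDCA would presumably replace the token-distance factor with a factor derived from a dependency parse of each sentence; that requires a parser and is left out of this sketch.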








