Abstract
To overcome the mismatch between the words of relevant documents and the words of the user's query, and the serious negative effect this mismatch has on information retrieval performance, a query expansion method based on a new term co-occurrence representation is proposed by analyzing the process by which a query is produced. Expansion terms are selected according to their correlation with the whole query rather than with individual query terms. The positional information between terms is also taken into account. Experimental results on Text REtrieval Conference (TREC) data collections show that the proposed method consistently improves on the language modeling method without expansion, by 5% to 19%. Compared with pseudo feedback, the popular approach to query expansion, the precision of the proposed method is competitive.
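The abstract does not give the scoring formula, so the following is only an illustrative sketch of the general idea, not the paper's method: candidate terms are scored against the whole query using co-occurrence counts that are weighted by term distance. The 1/distance weighting, the log-sum combination, the smoothing constant 0.01, the window size, and the function names are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def cooccurrence_weights(docs, window=5):
    """Accumulate distance-weighted co-occurrence counts for each term pair."""
    weights = defaultdict(float)
    for doc in docs:
        tokens = doc.lower().split()
        for i, left in enumerate(tokens):
            for j in range(i + 1, min(i + window + 1, len(tokens))):
                right = tokens[j]
                if left == right:
                    continue
                # Assumed positional weighting: closer pairs count more.
                w = 1.0 / (j - i)
                weights[(left, right)] += w
                weights[(right, left)] += w
    return weights


def expansion_terms(query, docs, k=3, window=5, smooth=0.01):
    """Rank candidate terms by their combined association with all query terms."""
    q_terms = query.lower().split()
    weights = cooccurrence_weights(docs, window)
    vocab = {t for doc in docs for t in doc.lower().split()}
    scores = {}
    for cand in vocab - set(q_terms):
        # Summing logs (assumed combination) rewards candidates that relate
        # to every query term, not just to one of them.
        scores[cand] = sum(
            math.log(smooth + weights.get((cand, q), 0.0)) for q in q_terms
        )
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


if __name__ == "__main__":
    toy_docs = [
        "query expansion improves information retrieval",
        "information retrieval uses query expansion terms",
        "cooking pasta needs boiling water",
    ]
    print(expansion_terms("information retrieval", toy_docs, k=4))
```

In this toy corpus, terms from the unrelated third document never co-occur with the query and so receive the lowest possible score. A real system would compute the statistics over a full collection and plug the selected terms into the retrieval model.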
Funding
The High Technology Research and Development Program of China (No. 2006AA01Z150)
The National Natural Science Foundation of China (No. 60435020)