Abstract
Recently, we designed a new experimental system, MSearch, a cross-media meta-search system built on the database of the WikipediaMM task of ImageCLEF 2008. For a meta-search engine, the core problem is how to merge the results from multiple member search engines into a more effective ranked list. This paper presents a novel fusion model based on supervised learning. The model uses ranking SVM to train the fusion weight of each member search engine. We treat each member search engine as one feature of a result document returned by the meta-search engine, so that the learned weight of that feature serves as the engine's fusion weight. For each returned result document, we first build a feature vector to represent the document, setting the value of each feature to the score the document received from the corresponding member search engine. We then construct a training set from the documents returned by the meta-search engine to learn the fusion parameters. Finally, we use a linear fusion model based on the overlap set to merge the result sets. Experimental results show that our approach significantly improves the performance of the cross-media meta-search (MSearch) and outperforms many existing fusion methods.
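The pipeline described in the abstract (per-engine scores as features, pairwise training of fusion weights, linear fusion) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the training data in `TRAIN` is invented, the function names are hypothetical, and the solver is a plain subgradient descent on the pairwise hinge loss, standing in for whatever ranking SVM implementation the authors used. The overlap-set handling mentioned in the abstract is not modeled.

```python
import random

# Hypothetical training data, invented for this sketch:
# (query_id, [score from engine 0, score from engine 1], relevance label).
TRAIN = [
    ("q1", [0.9, 0.2], 1),
    ("q1", [0.1, 0.8], 0),
    ("q1", [0.7, 0.5], 1),
    ("q1", [0.2, 0.6], 0),
    ("q2", [0.8, 0.9], 1),
    ("q2", [0.3, 0.1], 0),
]


def preference_pairs(docs):
    """Return difference vectors x_i - x_j for same-query pairs with rel_i > rel_j."""
    pairs = []
    for q_i, x_i, rel_i in docs:
        for q_j, x_j, rel_j in docs:
            if q_i == q_j and rel_i > rel_j:
                pairs.append([a - b for a, b in zip(x_i, x_j)])
    return pairs


def train_fusion_weights(docs, epochs=200, lr=0.1, reg=0.01, seed=0):
    """Learn one weight per member engine from pairwise preferences.

    Minimises an L2-regularised pairwise hinge loss (the linear ranking SVM
    objective) by stochastic subgradient descent. This is a stand-in solver,
    not the authors' implementation.
    """
    rng = random.Random(seed)
    pairs = preference_pairs(docs)
    w = [0.0] * len(docs[0][1])
    for _ in range(epochs):
        rng.shuffle(pairs)
        for d in pairs:
            margin = sum(w_k * d_k for w_k, d_k in zip(w, d))
            for k in range(len(w)):
                # The hinge term contributes only while the pair's margin is below 1.
                hinge_grad = -d[k] if margin < 1.0 else 0.0
                w[k] -= lr * (reg * w[k] + hinge_grad)
    return w


def fuse(scores, w):
    """Linear fusion: weighted sum of a document's member-engine scores."""
    return sum(w_k * s_k for w_k, s_k in zip(w, scores))


if __name__ == "__main__":
    weights = train_fusion_weights(TRAIN)
    q1_docs = [d for d in TRAIN if d[0] == "q1"]
    # Rank the documents of query q1 by fused score, best first.
    for _, scores, rel in sorted(q1_docs, key=lambda d: fuse(d[1], weights), reverse=True):
        print(scores, rel, round(fuse(scores, weights), 3))
```

In this toy data the first engine's scores separate relevant from non-relevant documents more reliably than the second's, so the learned weight for the first engine should come out larger. A real system would also need a convention for documents that some member engine did not return; that choice is left open here.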
Funding
Project supported by the National Natural Science Foundation of China (No. 60605020)
and the National High-Tech R&D Program (863) of China (Nos. 2006AA01Z320 and 2006AA010105)