Abstract
Web search systems often improve search performance by interacting with users to refine their queries. Besides text, Web pages contain a large amount of information in other modalities, such as images, audio, and video. Previous research on query refinement, however, has rarely exploited this multi-modal information. This paper proposes M2S2QR, a multi-modal Web query refinement method based on semi-supervised learning, which casts Web query refinement as a machine learning problem. First, a learner is trained for each modality from the Web pages that users have judged. Next, Web pages that users have not judged are used to improve the learners' performance. Finally, the learners of the different modalities are combined. Experiments validate the effectiveness of the proposed method.
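The three steps named in the abstract (train one learner per modality on judged pages, improve the learners with unjudged pages, combine them) can be illustrated with a minimal sketch. The abstract does not specify the learners, the semi-supervised scheme, or the combination rule, so everything below is an assumption made for illustration and is not the M2S2QR algorithm itself: the nearest-centroid learner, the co-training-style pseudo-labeling, the score averaging, and all names (`CentroidLearner`, `train_learners`, `combined_score`, the `threshold` and `rounds` parameters) are hypothetical.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]
Page = Dict[str, Vector]  # modality name -> feature vector (hypothetical layout)


def centroid(vectors: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class CentroidLearner:
    """Assumed stand-in learner: nearest-centroid relevance scorer for one modality."""

    def fit(self, vectors: List[Vector], labels: List[int]) -> "CentroidLearner":
        # Assumes at least one relevant (1) and one irrelevant (0) example.
        self.pos = centroid([v for v, y in zip(vectors, labels) if y == 1])
        self.neg = centroid([v for v, y in zip(vectors, labels) if y == 0])
        return self

    def score(self, v: Vector) -> float:
        """Score in [0, 1]; above 0.5 means closer to the relevant centroid."""
        dp, dn = sq_dist(v, self.pos), sq_dist(v, self.neg)
        return 0.5 if dp + dn == 0 else dn / (dp + dn)


def train_learners(labeled: List[Page], labels: List[int], unlabeled: List[Page],
                   modalities: List[str], rounds: int = 2,
                   threshold: float = 0.8) -> Dict[str, CentroidLearner]:
    """Step 1: fit per-modality learners on judged pages.
    Step 2: let a confident learner pseudo-label unjudged pages for the others."""
    train = {m: ([p[m] for p in labeled], list(labels)) for m in modalities}

    def fit_all() -> Dict[str, CentroidLearner]:
        return {m: CentroidLearner().fit(*train[m]) for m in modalities}

    models = fit_all()
    pool = list(unlabeled)
    for _ in range(rounds):
        remaining = []
        for page in pool:
            for m in modalities:
                s = models[m].score(page[m])
                if s >= threshold or s <= 1 - threshold:
                    y = 1 if s >= threshold else 0
                    # The confident modality teaches every other modality.
                    for other in modalities:
                        if other != m:
                            train[other][0].append(page[other])
                            train[other][1].append(y)
                    break
            else:
                remaining.append(page)  # no learner was confident; keep for later
        pool = remaining
        models = fit_all()
    return models


def combined_score(models: Dict[str, CentroidLearner], page: Page) -> float:
    """Step 3: combine the modalities by averaging their scores (an assumption)."""
    return sum(models[m].score(page[m]) for m in models) / len(models)
```

Pages scored above 0.5 by `combined_score` would be treated as relevant in this sketch. Averaging is only one possible combination rule, and the confidence threshold would need tuning on real data.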
Source
《计算机学报》
EI
CSCD
PKU Core (Peking University Core Journals)
2009, No. 10, pp. 2099-2106 (8 pages)
Chinese Journal of Computers
Funding
Supported by the National Natural Science Foundation of China (60635030, 60721002) and the Natural Science Foundation of Jiangsu Province (BK2008018)
Keywords
machine learning
semi-supervised learning
multi-modal information
Web search
query refinement