Abstract
When a user submits a query, a search engine that returns a list of related queries helps the user refine the query and find the needed information. This paper presents a new method for determining related Web queries using support vector regression. For a given Web query, five quantitative indicators are first extracted for each candidate query from the user logs: the number of times the candidate query was submitted, the number of users who submitted it, the number of user clicks on its returned results, the number of terms it shares with the given query, and the number of clicked URLs (Uniform Resource Locators) it shares with the given query. Part of the training data is then labeled by hand and a support vector regression model is built; related Web queries are determined according to the predicted relevance. Experimental results show that the proposed method achieves high accuracy.
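The five indicators listed in the abstract can be illustrated with a minimal Python sketch. The abstract does not describe the log format, so the `LogRecord` schema, the `Indicators` tuple, the `extract_indicators` helper, the whitespace tokenization of queries, and the sample data are all assumptions made for illustration, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import NamedTuple


@dataclass(frozen=True)
class LogRecord:
    """One query submission (hypothetical log schema, not from the paper)."""
    user: str
    query: str
    clicked_urls: tuple = ()  # URLs the user clicked in the returned results


class Indicators(NamedTuple):
    submissions: int   # times the candidate query was submitted
    users: int         # distinct users who submitted it
    clicks: int        # clicks on its returned results
    common_terms: int  # terms shared with the given query
    common_urls: int   # clicked URLs shared with the given query


def clicked_url_set(log, query):
    """Collect every URL clicked after submitting `query`."""
    return {url for r in log if r.query == query for url in r.clicked_urls}


def extract_indicators(log, given, candidate):
    """Compute the five indicators of `candidate` relative to `given`."""
    log = list(log)
    cand = [r for r in log if r.query == candidate]
    # Assumption: terms are whitespace-separated tokens; Chinese queries
    # would need a word segmenter instead.
    shared_terms = set(given.split()) & set(candidate.split())
    shared_urls = clicked_url_set(log, given) & clicked_url_set(log, candidate)
    return Indicators(
        submissions=len(cand),
        users=len({r.user for r in cand}),
        clicks=sum(len(r.clicked_urls) for r in cand),
        common_terms=len(shared_terms),
        common_urls=len(shared_urls),
    )


# Made-up sample log for demonstration only.
LOG = [
    LogRecord("u1", "python svm", ("a.example/svm", "b.example/libsvm")),
    LogRecord("u1", "svm tutorial", ("a.example/svm",)),
    LogRecord("u2", "svm tutorial", ("a.example/svm", "c.example/intro")),
    LogRecord("u3", "svm tutorial"),
]

features = extract_indicators(LOG, given="python svm", candidate="svm tutorial")
print(features)
```

Each resulting five-element tuple could serve as the feature vector of one candidate query, with the hand-labeled relevance as the regression target for a support vector regression model.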
Source
《华南理工大学学报(自然科学版)》
EI
CAS
CSCD
PKU Core (Peking University Core Journals)
2006, No. 6, pp. 74-78, 94 (6 pages in total)
Journal of South China University of Technology (Natural Science Edition)
Funding
National Natural Science Foundation of China (60573166)
Key Program of the National Natural Science Foundation of China (60435020)