Abstract
To help search engines deliver high-quality retrieval, an automatic query intention classification model is proposed. The model divides user queries into six categories (consultation, academic, resource, service, navigation, and hot topics), establishing a query intention classification scheme. A query intention processing module is added to a conventional search engine system. Three classification features are statistically analyzed: query term information (Qi), clicked URL information (Cu), and the click ranking of URLs within a given category (Cr). Feature vectors are extracted from these features and used to infer the user's query intention. Experiments on the Sogou dataset show that the F value of query classification exceeds 0.8 for every category, indicating good classification performance.
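The abstract reports a per-category F value but does not give its formula. The sketch below shows one common way such a score could be computed, assuming the F value is the balanced F1 measure, F1 = 2TP / (2TP + FP + FN), evaluated separately for each category. The function name `per_class_f1` and the English category labels are illustrative assumptions, not taken from the paper.

```python
from collections import Counter

# The six intent categories named in the abstract (English labels are assumed).
CLASSES = ["consultation", "academic", "resource",
           "service", "navigation", "hot topic"]


def per_class_f1(y_true, y_pred, classes=CLASSES):
    """Return {category: F1} from parallel lists of gold and predicted labels.

    Assumes the balanced F1 measure; the paper may define its F value differently.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for gold, pred in zip(y_true, y_pred):
        if gold == pred:
            tp[gold] += 1      # correct prediction for this category
        else:
            fp[pred] += 1      # predicted category gains a false positive
            fn[gold] += 1      # gold category gains a false negative
    scores = {}
    for c in classes:
        denom = 2 * tp[c] + fp[c] + fn[c]
        # A category that never appears in either list scores 0.0.
        scores[c] = 2 * tp[c] / denom if denom else 0.0
    return scores
```

Calling `per_class_f1(gold_labels, predicted_labels)` on a labeled test set yields one score per category, which can then be compared against the 0.8 threshold mentioned in the abstract.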
Authors
YANG Jie (杨杰)
XU Yue (徐越)
YU Jianqiao (余建桥)
JIANG Jianhua (蒋建华)
Affiliations: College of Mobile Telecommunications, Chongqing University of Posts and Telecommunications, Chongqing 401520, China; Unit 31102 of PLA, Nanjing 210016, China; School of Computer Engineering, Chongqing College of Humanities, Science & Technology, Chongqing 401524, China
Source
Command Information System and Technology (《指挥信息系统与技术》)
2019, Issue 2, pp. 74-79 (6 pages)
Funding
Supported by the National Natural Science Foundation of China (61303267)
Keywords
information search
classification
query intention
support vector machine