
Feature Analysis and Automatic Identification of Query Specificity (cited by: 5)
Abstract [Objective] Based on Sogou query logs, this paper builds a human-annotated collection, analyzes the features of query specificity, identifies specificity automatically, and evaluates the identification results. [Methods] Basic features and content features of user query strings are selected and analyzed statistically; decision tree, SVM, and Naive Bayes classifiers are then trained separately to identify query specificity automatically. [Results] Identification with these features works well: the macro-averaged F-measures under ten-fold cross-validation are all above 0.8. [Limitations] Feature selection does not take users' click-through information into account, and whether the conditional independence assumption of Naive Bayes can be ignored in this experiment still needs further verification. [Conclusions] Basic features and content features of the query string are, by themselves, enough to identify queries of weak, slight, and strong specificity (broad, medium, and specific queries) effectively.
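Source: New Technology of Library and Information Service (《现代图书情报技术》), CSSCI, 2015, No. 2, pp. 15-23 (9 pages)
Funding: A research outcome of the National Key Technology R&D Program project "Research on Ontology Construction, Storage, and Visualization Technologies for Cultural Heritage Knowledge" (Grant No. 2012BAH33F03) and the National Natural Science Foundation of China General Program project "Research on Language-Model-Based Generic Entity Retrieval Modeling and Framework Implementation" (Grant No. 71173164).
Keywords: Query specificity; Decision tree; SVM; Naive Bayes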
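The abstract reports results as a macro-averaged F-measure over the three specificity classes. As a minimal sketch of how that metric is computed (not the authors' evaluation code), the following computes a per-class F-measure and then takes the unweighted mean. The function name `macro_f_measure`, the English label names, and the toy gold and predicted labels are all illustrative assumptions; in a ten-fold cross-validation setting the predictions would come from each held-out fold.

```python
from collections import Counter


def macro_f_measure(gold, predicted, labels):
    """Macro-averaged F-measure: F1 per class, then the unweighted mean."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1          # correct prediction for class g
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[g] += 1          # g was the true class but was missed

    f_scores = []
    for label in labels:
        predicted_count = tp[label] + fp[label]
        gold_count = tp[label] + fn[label]
        precision = tp[label] / predicted_count if predicted_count else 0.0
        recall = tp[label] / gold_count if gold_count else 0.0
        denom = precision + recall
        f_scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(f_scores) / len(f_scores)


# Hypothetical labels and toy data, for illustration only.
LABELS = ["weak", "slight", "strong"]
gold = ["weak", "weak", "slight", "slight", "strong", "strong"]
pred = ["weak", "slight", "slight", "slight", "strong", "weak"]
score = macro_f_measure(gold, pred, LABELS)
print(round(score, 4))
```

Because every class contributes equally to the mean, a classifier that does poorly on a rare specificity class is penalized as much as one that does poorly on a common class.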


