
Automatic Identification of Query Expressions for Three Types of Query Intent Ambiguity (Cited by: 3)

Query Classification of Three Types of Ambiguity Intent
Abstract [Purpose/Significance] This paper addresses the automatic identification of query intent ambiguity. It examines which features are effective and how accurately different classification algorithms identify three types of query intent ambiguity, with the aim of informing follow-up research. [Method/Process] The paper first proposes a taxonomy of query expressions oriented toward query intent ambiguity. It then constructs six categories of features, drawn from the query expressions themselves and from their relevant documents. Finally, it applies a decision tree, a neural network, and the k-nearest neighbor algorithm individually to compare the effectiveness of different feature combinations and the classification accuracy of the different algorithms. [Results/Conclusions] (1) Classification accuracy improves by 49.5% over the baseline experiment; (2) classifying with query expression features outperforms classifying with relevant document features; (3) the decision tree achieves slightly higher classification accuracy than the other two algorithms. [Innovation/Limitations] The paper constructs a query taxonomy oriented toward query intent ambiguity and completes a classification task over three types of query intent ambiguity. However, because access to datasets was limited, the approach is validated on only 200 data items.
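The comparison of feature combinations described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the sample values, the labels `type_a` to `type_c`, the feature group names `query` and `document`, and all function names are hypothetical, and a plain k-nearest neighbor vote with leave-one-out evaluation stands in for the paper's experimental setup, which the abstract does not specify.

```python
from collections import Counter
from math import dist

# Hypothetical toy samples: the feature values and the labels "type_a".."type_c"
# are placeholders, not the paper's data or its actual ambiguity categories.
SAMPLES = [
    {"query": (0.0,), "document": (0.9,), "label": "type_a"},
    {"query": (0.1,), "document": (0.2,), "label": "type_a"},
    {"query": (0.2,), "document": (0.5,), "label": "type_a"},
    {"query": (5.0,), "document": (0.8,), "label": "type_b"},
    {"query": (5.1,), "document": (0.1,), "label": "type_b"},
    {"query": (5.2,), "document": (0.6,), "label": "type_b"},
    {"query": (10.0,), "document": (0.7,), "label": "type_c"},
    {"query": (10.1,), "document": (0.3,), "label": "type_c"},
    {"query": (10.2,), "document": (0.4,), "label": "type_c"},
]

# Feature combinations to compare (assumed grouping, for illustration only).
FEATURE_SETS = [("query",), ("document",), ("query", "document")]


def build_rows(samples, groups):
    """Concatenate the chosen feature groups into one vector per sample."""
    return [(sum((s[g] for g in groups), ()), s["label"]) for s in samples]


def knn_predict(train, x, k=3):
    """Majority vote among the k training rows closest to x (Euclidean)."""
    nearest = sorted(train, key=lambda row: dist(row[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def loo_accuracy(rows, k=3):
    """Leave-one-out accuracy: predict each row from all the others."""
    hits = 0
    for i, (x, y) in enumerate(rows):
        train = rows[:i] + rows[i + 1:]
        if knn_predict(train, x, k) == y:
            hits += 1
    return hits / len(rows)


def compare_feature_sets(samples, feature_sets=FEATURE_SETS, k=3):
    """Map each feature combination to its leave-one-out accuracy."""
    return {g: loo_accuracy(build_rows(samples, g), k) for g in feature_sets}


if __name__ == "__main__":
    for groups, acc in compare_feature_sets(SAMPLES).items():
        print("+".join(groups), round(acc, 2))
```

In this toy data the `query` values are separable by construction, so the numbers it prints say nothing about the paper's reported results; the sketch only shows the shape of a feature-combination comparison.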
Authors 桂思思 (GUI Si-si); 徐健 (XU Jian) (Nanjing Agricultural University, Nanjing 210095, China)
Source 《情报科学》 (Information Science), CSSCI, PKU Core Journal, 2021, No. 11, pp. 90-95 (6 pages)
Funding National Social Science Fund of China, Youth Project "Research on Query Intent for Academic Search" (19CTQ023).
Keywords query intent; ambiguity; automatic classification (query classification); feature construction (feature engineering); performance evaluation
