Learning to Rank Based on Query Clustering

Cited by: 6

Abstract: Learning to rank, an interdisciplinary field of information retrieval and machine learning, is drawing increasing attention, and many models have been designed to optimize ranking functions. However, few methods take the differences among queries into account. In this paper, queries are modeled as multivariate Gaussian distributions, Kullback-Leibler divergence is adopted as the distance measure between queries, spectral clustering is applied to group the queries into several clusters, and a ranking function is learned for each cluster. The experimental results show that the ranking functions trained with clustering require fewer training examples, yet are comparable to, or even outperform, those trained without clustering.
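The clustering step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the Gaussians are estimated, how KL divergence is turned into an affinity, or how many clusters are used. The ridge term, the symmetrised KL, the exponential kernel with a median scale, and the two-way split by the sign of the Fiedler vector are all assumptions made here, and the function names are hypothetical. Training one ranking function per cluster would follow and is not shown.

```python
import numpy as np

def fit_gaussian(features, ridge=1e-3):
    """Fit a multivariate Gaussian to one query's document feature vectors.

    `ridge` is an assumed regulariser that keeps the covariance invertible.
    """
    mean = features.mean(axis=0)
    cov = np.cov(features, rowvar=False) + ridge * np.eye(features.shape[1])
    return mean, cov

def kl_gaussian(p, q):
    """Closed-form KL(p || q) between two multivariate Gaussians."""
    (mu_p, cov_p), (mu_q, cov_q) = p, q
    d = mu_p.shape[0]
    cov_q_inv = np.linalg.inv(cov_q)
    diff = mu_q - mu_p
    _, logdet_p = np.linalg.slogdet(cov_p)
    _, logdet_q = np.linalg.slogdet(cov_q)
    return 0.5 * (np.trace(cov_q_inv @ cov_p) + diff @ cov_q_inv @ diff
                  - d + logdet_q - logdet_p)

def cluster_queries(query_features):
    """Split queries into two clusters with a simple spectral bipartition."""
    models = [fit_gaussian(f) for f in query_features]
    n = len(models)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            # Symmetrise KL so it behaves like a distance (assumption).
            sym = kl_gaussian(models[i], models[j]) + kl_gaussian(models[j], models[i])
            dist[i, j] = dist[j, i] = sym
    # Turn distances into affinities; the median scale is an assumed heuristic.
    scale = np.median(dist[dist > 0])
    affinity = np.exp(-dist / scale)
    # Normalised graph Laplacian: I - D^{-1/2} A D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(affinity.sum(axis=1))
    laplacian = np.eye(n) - d_inv_sqrt[:, None] * affinity * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order; column 1 is the Fiedler vector.
    _, eigvecs = np.linalg.eigh(laplacian)
    return (eigvecs[:, 1] > 0).astype(int)

# Synthetic usage: 4 queries whose documents centre on 0, 4 centred on 5.
rng = np.random.default_rng(0)
queries = [rng.normal(0.0, 1.0, size=(30, 3)) for _ in range(4)]
queries += [rng.normal(5.0, 1.0, size=(30, 3)) for _ in range(4)]
labels = cluster_queries(queries)
```

For more than two clusters, one would typically embed the queries with the first k eigenvectors and run k-means on the rows. A separate ranker would then be fitted on the training queries of each cluster.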

Authors: Hua Guichun (花贵春), Zhang Min (张敏), Liu Yiqun (刘奕群), Ma Shaoping (马少平), Ru Liyun (茹立云)

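Affiliations: State Key Laboratory of Intelligent Technology and Systems; Tsinghua National Laboratory for Information Science and Technology (preparatory); Department of Computer Science and Technology, Tsinghua University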

Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), 2012, No. 1, pp. 118-123 (6 pages). Indexed in EI, CSCD, and the Peking University Core Journals list.

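Funding: Supported by the National Natural Science Foundation of China (No. 60736044, 60903107, 61073071) and the Specialized Research Fund for the Doctoral Program of Higher Education (No. 20090002120005).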

Keywords: Learning to Rank; Ranking Function; Spectral Clustering

Classification code: TP391.3 [Automation and Computer Technology: Computer Application Technology]

