Multi-site fusion based on a Gaussian-exponential mixture model in a distributed retrieval system
Abstract: To improve retrieval performance, a fusion method based on a mixture model of the Gaussian and exponential distributions is applied to multi-site fusion in a distributed retrieval system. The method uses a Gaussian density function and an exponential density function to describe the relevance score distributions of the relevant and non-relevant documents, respectively, in each site's result set. Relevance scores are normalized with the mixture model, and the normalized scores are then merged. Because the method accounts for the difference between the score distributions of relevant and non-relevant documents, the computed relevance scores are more accurate, and better-performing sites can be assigned higher weights, which raises the average precision of the whole system. Experimental results indicate that the method outperforms other fusion methods.
Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, Peking University Core Journal, 2008, No. 1, pp. 155-158 (4 pages)
Funding: National Natural Science Foundation of China (Grant No. 60475021); Natural Science Foundation of Henan Province of China (Grant No. 2007520013).
Keywords: relevance score; mixture model; multi-site fusion
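
The normalization step described in the abstract can be illustrated with a minimal Python sketch. It assumes the normalized score is the posterior probability that a document is relevant, given its raw score, under a two-component mixture: a Gaussian for relevant documents and an exponential for non-relevant ones. The abstract does not give the parameter estimation procedure or the site weighting scheme, so the mixture parameters are taken as already-estimated inputs, and the merge step simply sums the posteriors per document. All function names and parameter names below are hypothetical.

```python
import math
from collections import defaultdict


def gaussian_pdf(score, mean, std):
    """Density of the assumed relevant-document score distribution."""
    z = (score - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def exponential_pdf(score, rate):
    """Density of the assumed non-relevant-document score distribution."""
    return rate * math.exp(-rate * score) if score >= 0 else 0.0


def posterior_relevance(score, prior_rel, mean, std, rate):
    """Posterior probability of relevance for one raw score (Bayes' rule)."""
    rel = prior_rel * gaussian_pdf(score, mean, std)
    non = (1.0 - prior_rel) * exponential_pdf(score, rate)
    total = rel + non
    return rel / total if total > 0 else 0.0


def fuse(site_results, site_params):
    """Merge per-site results by summing normalized scores per document.

    site_results: {site: {doc_id: raw_score}}
    site_params:  {site: {"prior_rel", "mean", "std", "rate"}} (assumed to be
                  estimated beforehand; estimation is not shown here)
    Summation is an illustrative merge rule, not the paper's confirmed one.
    """
    fused = defaultdict(float)
    for site, results in site_results.items():
        params = site_params[site]
        for doc_id, score in results.items():
            fused[doc_id] += posterior_relevance(score, **params)
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)
```

Because each site's raw scores are mapped to probabilities on a common 0 to 1 scale, scores from sites with different scoring ranges become comparable before merging. In practice the mixture parameters would be fitted per site and per query, for example with an EM procedure, which is omitted here.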