
Research and Improvement on Topic Distillation Algorithm in Web IR
Cited by: 3
Abstract: Search engines are currently the main tool for Web information retrieval, but their effectiveness is still unsatisfactory. In topic distillation algorithms based on Web link structure, the link-analysis iteration tends to converge on a Tightly Knit Community (TKC), a densely interlinked region of the link graph that is only weakly related to the query topic, and this causes topic drift. The authors' analysis of the classical topic distillation algorithm HITS shows two further shortcomings: it assigns unequal influence weights to different Web sites, and it cannot satisfy users' multiple-granularity information needs. Building on an analysis of existing research on topic distillation, the paper proposes an improved algorithm, g-HITSc, to address these shortcomings. Experimental results show that the new algorithm is reasonable and effective.
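Source: Computer Engineering and Applications (《计算机工程与应用》), indexed in CSCD and the Peking University Core Journals list, 2004, No. 17, pp. 174-178 (5 pages)
Funding: National Natural Science Foundation of China (No. 60173036); Jiangsu Province "Tenth Five-Year Plan" High-Tech Project (No. BG2001013)
Keywords: topic distillation, HITS, multiple-granularity, link analysis, Web IR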
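
The topic drift described in the abstract can be illustrated with a minimal sketch of the classical HITS iteration (Kleinberg's hub and authority updates), not of the paper's g-HITSc algorithm, whose details this record does not give. The toy graph, the node names, and the function name `hits` are all assumptions made for illustration: a hypothetical page `relevant` receives four in-links from independent hubs, while three off-topic pages form a tightly knit community in which three hubs each link to all three pages.

```python
from math import sqrt


def hits(edges, iterations=50):
    """Classical HITS on a directed graph given as (source, target) pairs."""
    nodes = sorted({node for edge in edges for node in edge})
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority update: sum the hub scores of the pages linking in.
        new_auth = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_auth[dst] += hub[src]
        # Hub update: sum the fresh authority scores of the pages linked to.
        new_hub = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_hub[src] += new_auth[dst]
        # L2-normalise both vectors so the scores stay bounded.
        a_norm = sqrt(sum(v * v for v in new_auth.values())) or 1.0
        h_norm = sqrt(sum(v * v for v in new_hub.values())) or 1.0
        auth = {n: v / a_norm for n, v in new_auth.items()}
        hub = {n: v / h_norm for n, v in new_hub.items()}
    return auth, hub


# Hypothetical toy graph (assumption, not data from the paper).
# On-topic part: four independent hubs each link to one relevant page.
edges = [(f"r_hub{i}", "relevant") for i in range(1, 5)]
# Off-topic tightly knit community: 3 hubs x 3 authorities, fully linked.
edges += [(f"t_hub{i}", f"t_auth{j}") for i in range(1, 4) for j in range(1, 4)]

auth, hub = hits(edges)
for name, score in sorted(auth.items(), key=lambda kv: -kv[1])[:4]:
    print(f"{name}: {score:.6f}")
```

Although `relevant` has the highest in-degree (4 versus 3), its authority score shrinks by a factor of 4/9 per iteration relative to the community's pages, because the densely interlinked block dominates the principal eigenvector. This is the convergence behaviour the abstract calls topic drift.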

