Study of Results Overlap and Uniqueness Among Major Chinese Web Search Engines (中文搜索引擎的搜索结果重合率研究)
Cited by: 2
Abstract: This study measures the degree of overlap and difference among the search results of the major Chinese web search engines. Using a sample of 11,171 randomly selected queries entered by real users, the authors tested Baidu, Google.cn, and Yahoo.cn and found that the engines' results differ greatly and overlap very little. Across all first-page results, 89.34% were unique to a single engine, 8.11% were shared by two of the three engines, and 2.54% were shared by all three. Overlap across the first two pages of results was lower still. This small degree of overlap points to significant differences in how the major Chinese web search engines retrieve and rank results for the same queries. Compared with overlap figures previously reported by American researchers for the major English web search engines, overlap is very low in both cases, and the Chinese and English figures are very close.
Authors: 王益明 (Wang Yiming), 刘菲 (Liu Fei)
Source: Journal of the China Society for Scientific and Technical Information (《情报学报》; indexed in CSSCI and the Peking University Core Journals list), 2009, No. 3, pp. 374-381 (8 pages)
Keywords: search engine, overlap, Baidu, Google.cn, Yahoo.cn
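
The three percentages in the abstract can be read as shares of distinct results returned by exactly one, two, or three engines. The following is a minimal sketch of that tally, not the authors' procedure. It assumes results are matched by exact URL string, that each query is counted separately, and that the denominator is the number of distinct results per query summed over all queries; the abstract does not specify any of these. The function name `overlap_shares` and the sample data are hypothetical.

```python
from collections import Counter


def overlap_shares(queries):
    """Return {k: share of distinct results returned by exactly k engines}.

    `queries` is a list with one entry per query; each entry maps an
    engine name to the list of result URLs that engine returned.
    Assumption: two results are "the same" only if their URLs are identical.
    """
    tally = Counter()
    for per_engine in queries:
        engines_per_url = Counter()
        for urls in per_engine.values():
            # set() so a URL repeated by one engine counts once for it
            for url in set(urls):
                engines_per_url[url] += 1
        # Count how many distinct URLs were seen by 1, 2, 3... engines
        tally.update(engines_per_url.values())
    total = sum(tally.values())
    if total == 0:
        return {}
    return {k: tally[k] / total for k in sorted(tally)}


# Hypothetical toy data: one query, three engines
sample = [
    {
        "baidu": ["a", "b", "c"],
        "google": ["a", "d"],
        "yahoo": ["a", "b", "e"],
    }
]
shares = overlap_shares(sample)
```

Under this reading, the reported 89.34%, 8.11%, and 2.54% would correspond to the shares at keys 1, 2, and 3 for the first-page results of all 11,171 queries.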