Hybrid strategy to distributed index organization in search engine (cited by: 1)
Abstract: To address the query-performance and scalability problems of index organization strategies in search engines, a hybrid distributed index organization strategy named Loc-Glob is proposed. Loc-Glob integrates the two well-studied index partitioning schemes that are widely used in search engines: local and global index organization. First, the index servers of the search engine are logically divided into several index server pools (clusters), and the index is partitioned across the pools according to the local (or global) index organization strategy, treating each pool as a single machine. Then, inside each pool, the index assigned to that pool is further partitioned across its index servers according to the global (or local) index organization strategy. Loc-Glob is more scalable than either traditional strategy, which helps it accommodate the explosive growth of web pages. Experimental results indicate that Loc-Glob improves on global index organization in both query performance and load balancing, achieves query throughput very close to that of local index organization, and maintains a good level of load balancing.
Source: Journal of Zhejiang University: Engineering Science (indexed in EI, CAS, CSCD, and the Peking University Core Journals list), 2009, No. 8, pp. 1361-1366 (6 pages)
Funding: National Basic Research Program of China ("973" Program), grant 2006CB303000
Keywords: search engine; inverted index; distributed index organization; query performance; load balancing
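The two-level partitioning described in the abstract can be illustrated with a small sketch. It shows only one of the two variants (local partitioning across pools, then global partitioning inside each pool). The function names `stable_hash`, `loc_glob_partition`, and `route_query`, and the use of hashing to assign documents and terms, are assumptions made for illustration; the abstract does not say how the paper assigns documents or terms to pools and servers.

```python
from collections import defaultdict
import zlib


def stable_hash(key: str) -> int:
    """Deterministic hash (Python's built-in hash() is salted per process)."""
    return zlib.crc32(key.encode("utf-8"))


def loc_glob_partition(docs, num_pools, servers_per_pool):
    """Place postings by document across pools, then by term inside a pool.

    docs maps doc_id -> list of terms. The result maps (pool, server)
    to a dict of term -> sorted list of doc_ids.
    Hash-based assignment is an illustrative assumption.
    """
    index = defaultdict(lambda: defaultdict(set))
    for doc_id, terms in docs.items():
        # Local step: the whole document belongs to exactly one pool.
        pool = stable_hash(doc_id) % num_pools
        for term in terms:
            # Global step: inside the pool, one server owns each term.
            server = stable_hash(term) % servers_per_pool
            index[(pool, server)][term].add(doc_id)
    return {
        node: {term: sorted(ids) for term, ids in postings.items()}
        for node, postings in index.items()
    }


def route_query(term, num_pools, servers_per_pool):
    """A single-term query visits one server in every pool."""
    server = stable_hash(term) % servers_per_pool
    return [(pool, server) for pool in range(num_pools)]
```

Under these assumptions, a single-term query touches one server per pool instead of every server (as in a purely local organization) or a single server holding the term's entire posting list (as in a purely global organization).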
