Abstract
Distributed systems are a good fit for systems that process massive amounts of data, and querying and retrieving data at that scale requires an index. Because creating and maintaining traditional indexes is inefficient, this paper designs a distributed index cluster solution based on Hadoop. Using Hadoop's distributed storage and computing capabilities, it adopts a distributed indexing algorithm based on a DHT (Distributed Hash Table) and spreads operations across the nodes of the distributed index cluster for parallel processing, improving the efficiency of data query and retrieval.
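The abstract does not say how index entries are mapped to nodes, so the following is only a minimal in-process sketch of one common DHT technique, consistent hashing, applied to an inverted index. The class name `DhtIndex`, the node names, the MD5-based ring, and the `replicas` parameter are all illustrative assumptions, not the paper's design, and the "nodes" here are plain dictionaries rather than Hadoop workers.

```python
import bisect
import hashlib
from collections import defaultdict


def _hash(key: str) -> int:
    """Map a string to a point on a 2**32 ring (MD5 is an arbitrary choice)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16) % (2 ** 32)


class DhtIndex:
    """Hypothetical inverted index whose terms are sharded by consistent hashing."""

    def __init__(self, nodes, replicas=64):
        # Each node gets several virtual points on the ring to even out load.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]
        # One term -> {doc ids} map per node, standing in for a node-local index.
        self._shards = {node: defaultdict(set) for node in nodes}

    def node_for(self, term: str) -> str:
        """Return the node owning the first ring point after the term's hash."""
        i = bisect.bisect_right(self._points, _hash(term)) % len(self._points)
        return self._ring[i][1]

    def add(self, doc_id: str, text: str) -> None:
        """Index a document by sending each distinct term to its owning node."""
        for term in set(text.lower().split()):
            self._shards[self.node_for(term)][term].add(doc_id)

    def search(self, term: str):
        """Look up a term on the single node responsible for it."""
        term = term.lower()
        return sorted(self._shards[self.node_for(term)].get(term, set()))


index = DhtIndex(["node-a", "node-b", "node-c"])
index.add("doc1", "Hadoop distributed storage")
index.add("doc2", "distributed hash table")
print(index.search("distributed"))  # ['doc1', 'doc2']
```

Because each term hashes to exactly one node, a lookup touches a single shard, and index updates for different terms can in principle proceed on different nodes at the same time; this sketch runs them sequentially in one process.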
Source
《电脑知识与技术(过刊)》
2011, Issue 12X, pp. 9043-9044 (2 pages)
Computer Knowledge and Technology