Abstract
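This paper studies the low efficiency of internal data downloads in HDFS and the load imbalance that can arise, and proposes optimizations from two angles: the overall download efficiency of a distributed file and the download efficiency of individual data blocks. Experimental results show that both methods improve efficiency, but when the cluster has a large number of DataNodes, combining the two improves download efficiency and balances DataNode load more effectively.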
To address problems such as low download efficiency and imbalanced DataNode load in the Hadoop Distributed File System (HDFS), this paper proposes two optimization methods: one improves the overall process of downloading a file, and the other optimizes the download of a single block with a parallel download algorithm that dynamically allocates load according to speed. Mathematical analysis and experiments show that both methods enhance efficiency. Moreover, combining the two methods makes downloading more efficient and more stable, and balances the load of the DataNodes to some extent.
Source
Journal of Computer Applications (《计算机应用》)
CSCD
Peking University Core Journals (北大核心)
2010, Issue 8, pp. 2060-2065, 2240 (7 pages)