Abstract
Starting from the architecture of the Hadoop Distributed File System (HDFS), this paper analyzes the single-point problems of the NameNode. On that basis, a small-file merging algorithm is proposed to address the NameNode's single-point memory bottleneck. Built on Hadoop and exploiting the characteristics of HDFS, the algorithm merges small files and serializes the resulting large file to HDFS, effectively relieving the NameNode's memory bottleneck when small files are numerous and improving the performance and reliability of the system.
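The merging approach described in the abstract can be illustrated with a minimal sketch: many small files are concatenated into one large container file, and a per-file (offset, length) index is kept so each original file can still be read back. On HDFS, the NameNode then holds metadata for one large file instead of thousands of small ones, which is what relieves its memory pressure. The function names and index format below are illustrative assumptions, not the paper's actual algorithm; production Hadoop deployments typically use facilities such as SequenceFile or Hadoop Archives (HAR) for the same purpose.

```python
# Hypothetical sketch of small-file merging: concatenate small files
# into one blob and record an (offset, length) index per file so each
# original file remains individually retrievable.
import io

def merge_small_files(files):
    """Merge a {name: bytes} mapping into one blob plus an offset index."""
    index = {}
    blob = io.BytesIO()
    for name, data in files.items():
        index[name] = (blob.tell(), len(data))  # where this file starts, and its size
        blob.write(data)
    return blob.getvalue(), index

def read_from_merged(blob, index, name):
    """Recover one original small file from the merged blob via the index."""
    offset, length = index[name]
    return blob[offset:offset + length]

files = {"a.log": b"alpha", "b.log": b"bravo", "c.log": b"charlie"}
blob, index = merge_small_files(files)
assert read_from_merged(blob, index, "b.log") == b"bravo"
```

In an HDFS setting, the merged blob would be written as a single large file (the "serialization" step in the abstract) and the index stored alongside it, so metadata cost at the NameNode grows with the number of merged containers rather than the number of small files.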
Source
《软件工程师》
2014, Issue 12, pp. 9-10, 6 (3 pages total)
Software Engineer