Abstract
To address the master-node metadata bottleneck that existing distributed file systems face when handling massive numbers of small files, this paper proposes storing metadata in distributed files and distributing it through metadata buffering and hash mapping. Metadata retrieval is implemented as a MapReduce parallel program; the paper identifies the problems that arise in parallel retrieval and optimizes retrieval with a local bitmap index. Experiments verify the approach: the results show that the method achieves distributed storage and retrieval of massive metadata and avoids the single-point performance bottleneck at the master node that existing distributed file systems exhibit when processing massive small files.
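The abstract gives no implementation detail, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a CRC32 hash on the file path, four partitions, and a bitmap index on a hypothetical `owner` attribute; the per-partition search stands in for a map task and the merge for a reduce step, run sequentially here rather than on a MapReduce cluster.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # assumed partition count, chosen for illustration only


def partition_of(path: str) -> int:
    """Map a file path to a partition using a stable hash (CRC32, an assumption)."""
    return zlib.crc32(path.encode("utf-8")) % NUM_PARTITIONS


class Partition:
    """One metadata partition with a local bitmap index on one attribute."""

    def __init__(self, indexed_attr: str = "owner"):
        self.indexed_attr = indexed_attr  # hypothetical indexed attribute
        self.records = []                 # record position doubles as bit number
        self.bitmaps = defaultdict(int)   # attribute value -> Python int as bitmap

    def add(self, record: dict) -> None:
        pos = len(self.records)
        self.records.append(record)
        self.bitmaps[record[self.indexed_attr]] |= 1 << pos

    def search(self, value: str) -> list:
        """'Map' step: walk the bitmap for `value` instead of scanning every record."""
        bits = self.bitmaps.get(value, 0)
        hits, pos = [], 0
        while bits:
            if bits & 1:
                hits.append(self.records[pos])
            bits >>= 1
            pos += 1
        return hits


def build_partitions(records: list) -> list:
    """Distribute metadata records across partitions by hashing their paths."""
    partitions = [Partition() for _ in range(NUM_PARTITIONS)]
    for record in records:
        partitions[partition_of(record["path"])].add(record)
    return partitions


def parallel_search(partitions: list, owner: str) -> list:
    """'Reduce' step: merge the per-partition hits (sequential in this sketch)."""
    results = []
    for partition in partitions:
        results.extend(partition.search(owner))
    return results
```

Because each bitmap is local to its partition, a lookup touches only the bits for the requested value in each partition and no global index has to be kept on a master node.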
Source
Journal of Air Force Early Warning Academy (《空军预警学院学报》)
2014, No. 6, pp. 427-431 (5 pages)
Keywords
massive small files
metadata
distributed storage
parallel retrieval