Abstract
To address the low efficiency of distributed file systems when handling small files, we propose an application architecture for small and medium-sized distributed file system clusters. The intranet of a conventional cluster is divided into two subnets: an external subnet, which carries the data exchanged with the external network, and an internal subnet, which carries the cluster's management data. Every data node is connected to both subnets and, in place of the name node, exchanges data directly with the external network; the name node itself is connected only to the internal subnet. A firewall between the external subnet and the external network strengthens security, and a load-balancing device distributes data requests from the external network across the data nodes. A caching mechanism is added to optimize small-file operations. We deployed an experimental environment and designed a test program that measures cache performance by simulating multiple threads continuously reading a large number of files, using 1000 files of 100 KB each. The experiments show that the design is feasible, that adding a disk cache improves the efficiency of storing and retrieving small files, and that the optimization effect is significant.
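The cache test described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not describe the cache design, so the sketch assumes a read-through cache with LRU eviction, uses an in-memory dictionary as a stand-in for the disk cache, and replaces the distributed file system with a hypothetical `fake_dfs_read` function. The names `ReadThroughCache` and `run_test`, the thread count, and the number of rounds are likewise assumptions made for illustration.

```python
import threading
from collections import OrderedDict
from concurrent.futures import ThreadPoolExecutor


class ReadThroughCache:
    """Thread-safe LRU cache placed in front of a slower backing store.

    Assumption: an in-memory dict stands in for the disk cache in the paper.
    """

    def __init__(self, fetch, capacity):
        self._fetch = fetch          # hypothetical callable: file name -> bytes
        self._capacity = capacity    # maximum number of cached files
        self._items = OrderedDict()  # insertion order doubles as LRU order
        self._lock = threading.Lock()
        self.hits = 0
        self.misses = 0

    def read(self, name):
        with self._lock:
            if name in self._items:
                self._items.move_to_end(name)  # mark as most recently used
                self.hits += 1
                return self._items[name]
            self.misses += 1
        data = self._fetch(name)  # slow path, done outside the lock
        with self._lock:
            self._items[name] = data
            self._items.move_to_end(name)
            while len(self._items) > self._capacity:
                self._items.popitem(last=False)  # evict least recently used
        return data


def fake_dfs_read(name):
    """Stand-in for a distributed file system read; returns 100 KB of bytes."""
    return name.encode().ljust(100 * 1024, b"\0")


def run_test(cache, names, threads=8, rounds=3):
    """Read every file once per round from a thread pool; return (hits, misses)."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        for _ in range(rounds):
            # list() forces the round to finish before the next one starts.
            list(pool.map(cache.read, names))
    return cache.hits, cache.misses


if __name__ == "__main__":
    names = [f"file_{i:04d}" for i in range(1000)]  # 1000 files of 100 KB
    cache = ReadThroughCache(fake_dfs_read, capacity=len(names))
    hits, misses = run_test(cache, names)
    print(f"hit ratio: {hits / (hits + misses):.2f}")
```

In this sketch only the first round reaches the backing store when the capacity covers all files; shrinking `capacity` below the number of files is one way to observe how eviction lowers the hit ratio.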
Source
Journal of Wuhan Institute of Technology (《武汉工程大学学报》)
CAS
2014, Issue 1, pp. 69-73 (5 pages)
Funding
Sub-project of a National Science and Technology Major Project of the 12th Five-Year Plan (2011ZX05023-005-006)
Keywords
cache
small and medium-sized distributed file system
management data