Abstract
In distributed file systems, small-file management typically suffers from poor access performance, low disk space utilization, and high file transfer delay. To address these problems, this paper proposes a performance-optimized small file storage access (SFSA) strategy. SFSA stores logically contiguous data in contiguous physical disk space wherever possible, uses a cache to take on the role of the metadata server, and raises cache utilization through simplified file information nodes; together these measures improve small-file access performance. To improve disk space utilization, SFSA optimizes writes by aggregating updated (dirty) data with the related data in its folder domain into a single I/O request, which reduces the number of file fragments. For file transfer, SFSA exploits the principle of locality and sends batches of frequently accessed small files ahead of time, which reduces the overhead of establishing network connections and improves transfer performance. Theoretical analysis and experimental results show that the design and methods of SFSA effectively improve small-file storage access performance.
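The locality-based batch transfer described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the class `PrefetchingStore`, the method `get_batch`, the `batch_size` parameter, and the choice of "same directory" as the locality unit are all hypothetical, chosen only to show the idea of returning frequently accessed sibling files together with the requested one so that fewer round trips are needed.

```python
from collections import Counter, defaultdict
import posixpath


class PrefetchingStore:
    """Toy in-memory store that bundles hot sibling files with each read.

    Assumption: "locality" is approximated by files sharing a directory,
    and "highly accessed" by a simple per-directory access counter.
    """

    def __init__(self, batch_size=2):
        self.files = {}                    # path -> file contents (bytes)
        self.hits = defaultdict(Counter)   # directory -> access count per path
        self.batch_size = batch_size       # max number of extra files per reply

    def put(self, path, data):
        self.files[path] = data

    def get_batch(self, path):
        """Return the requested file plus the hottest files in its directory."""
        directory = posixpath.dirname(path)
        self.hits[directory][path] += 1

        batch = {path: self.files[path]}
        # most_common() yields paths from most to least frequently accessed.
        for other, _count in self.hits[directory].most_common():
            if len(batch) > self.batch_size:
                break  # requested file + batch_size extras already collected
            if other != path:
                batch[other] = self.files[other]
        return batch
```

With `batch_size=1`, a request for one file in a directory also returns the single most frequently read other file from that directory, so a client that reads it next needs no additional connection. A real system would also have to bound the batch by size in bytes and age out stale counters, which this sketch omits.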
Source
《计算机研究与发展》
EI
CSCD
PKU Core Journal
2012, No. 7, pp. 1579-1586 (8 pages)
Journal of Computer Research and Development
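Funding
National Natural Science Foundation of China (60573145)
Specialized Research Fund for the Doctoral Program of Higher Education of China (200805610019)
Guangzhou Science and Technology Program, Applied Basic Research Fund (2010Y1-C681)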
Keywords
distributed file system
small file storage
small file storage access (SFSA)
block
optimization
access performance