
A Distributed File Placement Algorithm without Depending on Popularity Information
Abstract File placement is a long-standing research topic in distributed storage. The distributed file storage system HDFS places files on randomly selected nodes, which leaves the access load unevenly distributed. Researchers have proposed many placement algorithms based on file access popularity; however, popularity changes dynamically and is difficult to predict accurately. This paper proposes a distributed file placement algorithm that does not depend on popularity information. The algorithm uses only each file's creation time, exploiting the correlation between the time elapsed since a file was created and its access popularity. It first partitions time into intervals, then counts, for each node, the volume of data in files created within each interval; during placement it keeps the data volume belonging to the same interval roughly equal across nodes. Experimental results show that the algorithm not only balances the storage load across nodes but also improves the balance of the access load, eliminating the performance bottleneck caused by uneven file popularity.
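Source Science Technology and Engineering (《科学技术与工程》), Peking University Core Journal, 2018, Issue 2, pp. 285-289 (5 pages)
Funding Doctoral Start-up Fund of Xi'an University of Science and Technology (2015QDJ031); Special Scientific Research Program of the Shaanxi Provincial Department of Education (15JK1468)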
Keywords distributed file storage system; file popularity; file placement; load balance
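
The placement rule summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `TimeIntervalPlacer`, the fixed-width intervals, and the greedy choice of the node currently holding the least data for the file's interval are all assumptions made for illustration, and replication is ignored.

```python
from collections import defaultdict


class TimeIntervalPlacer:
    """Hypothetical sketch: keep per-interval data volume roughly equal across nodes."""

    def __init__(self, nodes, interval_seconds):
        self.nodes = list(nodes)
        # Assumption: equal-width intervals; the paper may partition time differently.
        self.interval_seconds = interval_seconds
        # volume[interval][node] = bytes of files created in that interval and stored on node
        self.volume = defaultdict(lambda: {n: 0 for n in self.nodes})

    def interval_of(self, created_at):
        """Map a creation timestamp (seconds) to the index of its time interval."""
        return int(created_at // self.interval_seconds)

    def place(self, size_bytes, created_at):
        """Choose a node for a new file and record its size under the file's interval."""
        bucket = self.volume[self.interval_of(created_at)]
        # Assumption: greedy rule, pick the node with the least data in this interval;
        # ties are broken by node order.
        node = min(self.nodes, key=lambda n: bucket[n])
        bucket[node] += size_bytes
        return node
```

Because files created in the same interval are expected to have similar popularity, spreading each interval's data evenly is intended to spread the access load as well, without measuring popularity directly.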
