A replication method with balancing read and update overhead
Abstract: Large distributed systems typically replicate stored data across multiple nodes to reduce data access time. However, as the number of replicas grows, the write cost of updating them grows as well. How to choose the nodes that store replicas and control the number of replicas so as to balance read and write overhead, and thereby reduce the system's total data access cost, is an active research topic in distributed storage. To address this problem, this paper proposes a data replication method based on a genetic algorithm that balances read and write overhead. Specifically, the paper improves the genetic algorithm in two respects: (1) it builds an evaluation function that jointly accounts for the data transfer cost of reads and writes, which steers the convergence direction of the genetic algorithm toward an optimal or suboptimal replica placement strategy; (2) it uses time series forecasting to heuristically guide the chromosome mutation operation, so that the number of replicas adapts to the trend in read and write accesses. Experiments show that, compared with traditional methods, the proposed method reduces the total time cost of data access more effectively.
Source: Journal of Sichuan University (Natural Science Edition), 2013, No. 1, pp. 56-60 (5 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list.
Funding: National Natural Science Foundation of China (61173159)
Keywords: data replication, read and update overhead, distributed system, replica number
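The two improvements described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract gives neither the evaluation function nor the forecasting method, so everything below is an assumption. A chromosome is a bit list with one gene per node (1 means the node stores a replica); the hypothetical `total_cost` assumes each read is served by the nearest replica and each write is propagated to every replica; `smooth` substitutes simple exponential smoothing for the unspecified time series forecast; `guided_mutation` and `place_replicas` are illustrative names.

```python
import random

def total_cost(chrom, dist, reads, writes):
    """Assumed model: reads go to the nearest replica, writes to every replica."""
    replicas = [j for j, bit in enumerate(chrom) if bit]
    if not replicas:                      # a placement with no copy is invalid
        return float("inf")
    read = sum(r * min(dist[i][j] for j in replicas) for i, r in enumerate(reads))
    write = sum(w * sum(dist[i][j] for j in replicas) for i, w in enumerate(writes))
    return read + write

def smooth(history, alpha=0.5):
    """Simple exponential smoothing; returns the one-step-ahead forecast."""
    level = history[0]
    for x in history[1:]:
        level = alpha * x + (1 - alpha) * level
    return level

def guided_mutation(chrom, p_add, rng):
    """Flip one gene: add a replica with probability p_add, otherwise remove one."""
    chrom = list(chrom)
    want = 0 if rng.random() < p_add else 1   # gene value eligible for flipping
    candidates = [i for i, b in enumerate(chrom) if b == want]
    if candidates:
        i = rng.choice(candidates)
        chrom[i] = 1 - chrom[i]
    return chrom

def place_replicas(dist, reads, writes, read_hist, write_hist,
                   pop_size=30, generations=60, p_mut=0.3, seed=0):
    """Genetic search over placements; needs at least two nodes."""
    rng = random.Random(seed)
    n = len(dist)
    # Forecast the read/write mix; a read-heavy trend favours adding replicas.
    f_read, f_write = smooth(read_hist), smooth(write_hist)
    p_add = f_read / (f_read + f_write) if f_read + f_write else 0.5
    cost = lambda c: total_cost(c, dist, reads, writes)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=cost)
        nxt = [pop[0][:]]                         # elitism: keep the best placement
        while len(nxt) < pop_size:
            a = min(rng.sample(pop, 3), key=cost)  # tournament selection
            b = min(rng.sample(pop, 3), key=cost)
            cut = rng.randrange(1, n)              # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < p_mut:
                child = guided_mutation(child, p_add, rng)
            nxt.append(child)
        pop = nxt
    return min(pop, key=cost)
```

Under this assumed cost model, a read-heavy forecast pushes `p_add` toward 1 so mutation tends to add replicas, while a write-heavy forecast tends to remove them. The evaluation function still decides which placements survive selection.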