Research on a Data Replication Optimization Model Based on Fuzzy Forecasting (基于模糊预测的数据复制优化模型的研究)

Abstract: Cloud data processing systems widely replicate data to guard against data loss. If the number or placement of replicas is poorly chosen, data availability can fall below what users expect, or storage space is wasted (for example, when too many copies are kept). To address this problem, the paper proposes a data replication optimization model based on fuzzy forecasting, consisting of a fuzzy forecasting module and a replication optimization module. The fuzzy forecasting module takes node information (CPU, node bandwidth, memory, and hard disk) as input and predicts each node's availability. The replication optimization module takes the predicted node availability and the user's expected data availability as input and computes the number and placement of replicas needed to meet that expectation. The model can therefore optimize replication dynamically according to the availability of data nodes in a cloud storage system, and it achieves a good storage cost-performance ratio. In simulation experiments, the proposed strategy required 42.62% and 42.84% of the storage space used by the Hadoop strategy while reaching average file availability of 88.69% and 90.54%, respectively, indicating that the model saves storage space while maintaining file availability.
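The two-stage pipeline described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's method: the abstract does not give the membership functions, the rule base, or the optimization procedure, so the triangular low/medium/high memberships, the rule outputs in `RULE_OUTPUT`, the use of `min` to combine the four resources, the assumption that node failures are independent, and the greedy selection in `choose_replicas` are all hypothetical choices. Inputs are assumed to be normalized to the range 0 to 1 (for example, the fraction of a resource that is free).

```python
# Hypothetical sketch: fuzzy-style node availability prediction followed by
# replica selection. None of the constants or rules come from the paper.

# Assumed availability associated with the "low", "medium", "high" fuzzy sets.
RULE_OUTPUT = (0.5, 0.8, 0.95)


def memberships(x):
    """Return (low, medium, high) membership degrees for x clamped to [0, 1]."""
    x = min(max(x, 0.0), 1.0)
    low = max(0.0, 1.0 - 2.0 * x)                  # 1 at x=0, 0 from x=0.5 on
    medium = max(0.0, 1.0 - abs(2.0 * x - 1.0))    # peaks at x=0.5
    high = max(0.0, 2.0 * x - 1.0)                 # 0 up to x=0.5, 1 at x=1
    return low, medium, high


def resource_score(x):
    """Defuzzify one resource reading as a membership-weighted average."""
    degrees = memberships(x)
    weighted = sum(d * out for d, out in zip(degrees, RULE_OUTPUT))
    return weighted / sum(degrees)  # degrees always sum to 1 on [0, 1]


def predict_availability(cpu, bandwidth, memory, disk):
    """Predict node availability; the weakest resource limits the node."""
    return min(resource_score(v) for v in (cpu, bandwidth, memory, disk))


def choose_replicas(node_availability, target):
    """Greedily pick the most available nodes until the target is met.

    Assumes independent node failures, so the data is available with
    probability 1 - prod(1 - a_i) over the chosen nodes.
    """
    unavailable = 1.0
    chosen = []
    ranked = sorted(node_availability.items(), key=lambda kv: kv[1], reverse=True)
    for node, availability in ranked:
        chosen.append(node)
        unavailable *= 1.0 - availability
        if 1.0 - unavailable >= target:
            return chosen
    raise ValueError("target availability cannot be met with the given nodes")


if __name__ == "__main__":
    # Made-up node readings, for illustration only.
    nodes = {
        "dn1": predict_availability(0.9, 0.8, 0.7, 0.9),
        "dn2": predict_availability(0.4, 0.6, 0.5, 0.8),
        "dn3": predict_availability(0.2, 0.3, 0.9, 0.9),
    }
    print(nodes)
    print(choose_replicas(nodes, target=0.95))
```

The point of the sketch is the trade-off named in the abstract: when nodes are predicted to be highly available, fewer replicas are enough to meet the user's expectation, which is where a saving over a fixed replica count would come from.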

Authors: Wang Lixiang (王理想), Liu Bo (刘波), Lin Weiwei (林伟伟)

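Affiliations: School of Computer Science, South China Normal University (华南师范大学计算机学院); School of Computer Science and Engineering, South China University of Technology (华南理工大学计算机科学与工程学院)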

Source: Computer Technology and Development (《计算机技术与发展》), 2013, No. 12, pp. 82-85, 91 (5 pages)

Funding: Natural Science Foundation of Guangdong Province (10451064101005155, S2011010001754); Guangdong Province Science and Technology Plan Project (2010B010600032)

Keywords: fuzzy logic; data replication; data node; availability

Classification: TP39 [Automation and Computer Technology: Computer Application Technology]
