
Reliability analysis models for replication-based storage systems with proactive fault tolerance

Cited by: 2
Abstract A proactive fault tolerance mechanism predicts impending disk failures and prompts the system to migrate and back up at-risk data in advance, which can significantly improve storage system reliability. Because existing research cannot accurately evaluate the reliability of replication-based storage systems with proactive fault tolerance, several state transition models are proposed for such systems. The models are implemented with Monte Carlo simulation to simulate the operation of these systems, and the expected number of data-loss events over a given operating period is then computed. The Weibull distribution is used to model the time distribution of device failure and failure repair events, and the impact of the proactive fault tolerance mechanism, node failures, node failure repairs, disk failures, and disk failure repairs on system reliability is evaluated quantitatively. Experimental results show that when the accuracy of the prediction model reaches 50%, system reliability improves by 1 to 3 times, and that 3-way replication systems are more sensitive to system parameters than 2-way replication systems. The proposed models can help system administrators compare and weigh system reliability under different fault tolerance schemes and system parameters, and thereby build storage systems with high reliability and high availability.
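The idea in the abstract (Weibull-distributed failure and repair times, Monte Carlo simulation, counting data-loss events over an operating period) can be illustrated with a minimal sketch. This is not the paper's state transition model: it assumes a single 2-way replicated pair of disks, no node failures, and that a correctly predicted failure opens no exposure window because the data was migrated beforehand. The function names and every parameter value (`fail_scale`, `fail_shape`, `repair_scale`, `repair_shape`) are hypothetical and chosen only for illustration.

```python
import random


def simulate_pair(mission_hours, accuracy, rng,
                  fail_scale=100_000.0, fail_shape=1.2,
                  repair_scale=24.0, repair_shape=2.0):
    """Count data-loss events for one 2-way replicated pair over one mission.

    Simplifying assumptions (not taken from the paper):
    - a predicted failure is handled proactively and opens no exposure window;
    - an unpredicted failure opens a repair window, and any failure of the
      other replica inside that window counts as one data-loss event.
    """
    def lifetime():
        # random.weibullvariate(alpha, beta): alpha is scale, beta is shape.
        return rng.weibullvariate(fail_scale, fail_shape)

    next_fail = [lifetime(), lifetime()]
    losses = 0
    while True:
        first = 0 if next_fail[0] <= next_fail[1] else 1
        other = 1 - first
        t = next_fail[first]
        if t >= mission_hours:
            return losses
        if rng.random() < accuracy:
            # Predicted: data migrated in advance, disk replaced at time t.
            next_fail[first] = t + lifetime()
            continue
        # Unpredicted: reactive repair with a Weibull-distributed duration.
        repair_end = t + rng.weibullvariate(repair_scale, repair_shape)
        if next_fail[other] < min(repair_end, mission_hours):
            # Both replicas are down at once: record a data-loss event
            # and restart the other replica when the repair completes.
            losses += 1
            next_fail[other] = repair_end + lifetime()
        next_fail[first] = repair_end + lifetime()


def expected_losses(trials, mission_hours, accuracy, seed=0, **params):
    """Monte Carlo estimate of the expected number of data-loss events."""
    rng = random.Random(seed)
    total = sum(simulate_pair(mission_hours, accuracy, rng, **params)
                for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    for acc in (0.0, 0.5, 0.9):
        est = expected_losses(2000, 10 * 8760, acc,
                              fail_scale=20_000.0, repair_scale=72.0)
        print(f"accuracy={acc:.1f}  expected data-loss events ~ {est:.4f}")
```

Comparing the estimates across `accuracy` values gives a rough feel for how prediction accuracy changes the expected number of data-loss events under these assumptions; the numbers it prints say nothing about the paper's reported results.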
Authors LI Jing (李静); LUO Jinfei (罗金飞); LI Bingchao (李炳超) (College of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300, China; College of Computer Science, Nankai University, Tianjin 300350, China)
Source Journal of Computer Applications (《计算机应用》), CSCD, Peking University Core, 2021, Issue 4, pp. 1113-1121 (9 pages)
Funding National Natural Science Foundation of China, Young Scientists Fund (61702521); Fundamental Research Funds for the Central Universities (3122019122).
Keywords proactive fault tolerance; replication-based storage system; reliability analysis; node failure; disk failure; Weibull distribution; system state transition
