The development of cloud computing and virtualization technology has brought great challenges to the reliability of data center services.Data centers typically contain a large number of compute and storage nodes which...The development of cloud computing and virtualization technology has brought great challenges to the reliability of data center services.Data centers typically contain a large number of compute and storage nodes which may fail and affect the quality of service.Failure prediction is an important means of ensuring service availability.Predicting node failure in cloud-based data centers is challenging because the failure symptoms reflected have complex characteristics,and the distribution imbalance between the failure sample and the normal sample is widespread,resulting in inaccurate failure prediction.Targeting these challenges,this paper proposes a novel failure prediction method FP-STE(Failure Prediction based on Spatio-temporal Feature Extraction).Firstly,an improved recurrent neural network HW-GRU(Improved GRU based on HighWay network)and a convolutional neural network CNN are used to extract the temporal features and spatial features of multivariate data respectively to increase the discrimination of different types of failure symptoms which improves the accuracy of prediction.Then the intermediate results of the two models are added as features into SCSXGBoost to predict the possibility and the precise type of node failure in the future.SCS-XGBoost is an ensemble learning model that is improved by the integrated strategy of oversampling and cost-sensitive learning.Experimental results based on real data sets confirm the effectiveness and superiority of FP-STE.展开更多
不间断电源(uninterruptible power system,UPS)是数据中心供电连续性的重要保障。根据三母线供电方案原理,提出了四母线冗余供电方案,UPS负载率从三母线方案的66.6%提升至75%,同时具备双总线系统的可靠性等级。文章提出一种大型数据中...不间断电源(uninterruptible power system,UPS)是数据中心供电连续性的重要保障。根据三母线供电方案原理,提出了四母线冗余供电方案,UPS负载率从三母线方案的66.6%提升至75%,同时具备双总线系统的可靠性等级。文章提出一种大型数据中心UPS四母线冗余供电方案,该方案在不降低可靠性等级的前提下降低了建设成本,为数据中心供电系统设计提供借鉴和参考。展开更多
基金supported in part by National Key Research and Development Program of China(2019YFB2103200)NSFC(61672108),Open Subject Funds of Science and Technology on Information Transmission and Dissemination in Communication Networks Laboratory(SKX182010049)+1 种基金the Fundamental Research Funds for the Central Universities(5004193192019PTB-019)the Industrial Internet Innovation and Development Project 2018 of China.
文摘The development of cloud computing and virtualization technology has brought great challenges to the reliability of data center services.Data centers typically contain a large number of compute and storage nodes which may fail and affect the quality of service.Failure prediction is an important means of ensuring service availability.Predicting node failure in cloud-based data centers is challenging because the failure symptoms reflected have complex characteristics,and the distribution imbalance between the failure sample and the normal sample is widespread,resulting in inaccurate failure prediction.Targeting these challenges,this paper proposes a novel failure prediction method FP-STE(Failure Prediction based on Spatio-temporal Feature Extraction).Firstly,an improved recurrent neural network HW-GRU(Improved GRU based on HighWay network)and a convolutional neural network CNN are used to extract the temporal features and spatial features of multivariate data respectively to increase the discrimination of different types of failure symptoms which improves the accuracy of prediction.Then the intermediate results of the two models are added as features into SCSXGBoost to predict the possibility and the precise type of node failure in the future.SCS-XGBoost is an ensemble learning model that is improved by the integrated strategy of oversampling and cost-sensitive learning.Experimental results based on real data sets confirm the effectiveness and superiority of FP-STE.
文摘不间断电源(uninterruptible power system,UPS)是数据中心供电连续性的重要保障。根据三母线供电方案原理,提出了四母线冗余供电方案,UPS负载率从三母线方案的66.6%提升至75%,同时具备双总线系统的可靠性等级。文章提出一种大型数据中心UPS四母线冗余供电方案,该方案在不降低可靠性等级的前提下降低了建设成本,为数据中心供电系统设计提供借鉴和参考。