Funding: Supported by the National Defense Foundation of China (71601183).
Abstract: The degradation data of the multiple components in a missile are obtained by periodic testing. How to use these data to assess the storage reliability (SR) of the whole missile is a difficult problem in current research. An SR assessment model based on the competing failure of multiple components in a missile is proposed. By analyzing the missile life profile and its storage failure features, the key components of the missile are identified, and the characteristic voltage is taken as the key performance parameter of each component. When the voltage test data of the key components are available, a state space model (SSM) is applied to obtain the degradation state of the whole missile, which is defined as the missile degradation degree (DD). A Wiener process with a time-scale model (TSM) is used to build a degradation failure model that captures individual variability and nonlinearity. The Weibull distribution and the proportional hazards model are used to build an outburst failure model that accounts for the effect of performance degradation. Furthermore, a competing failure model that captures the correlation between degradation failure and outburst failure is proposed. A numerical example with a set of missiles in storage is analyzed to demonstrate the accuracy and superiority of the proposed model.
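The following is a minimal, self-contained sketch (not the authors' implementation) of how a Wiener degradation process with a time-scale transformation and a Weibull/proportional-hazards outburst model could be combined into a competing-risk storage reliability estimate. All parameter values, the power-law form of the time-scale model, and the independence assumption in the final product are illustrative assumptions; the paper itself models the correlation between the two failure modes.

```python
import numpy as np
from scipy.stats import norm

# --- illustrative parameters (assumed, not taken from the paper) ---
MU, SIGMA = 0.8, 0.3        # Wiener drift and diffusion of the degradation degree
B = 1.2                     # exponent of the time-scale model tau(t) = t**B (nonlinearity)
D_F = 10.0                  # degradation failure threshold on the degradation degree
ETA, BETA = 25.0, 2.0       # Weibull scale/shape of the baseline outburst hazard
GAMMA = 0.15                # proportional-hazards link between degradation and outburst risk

def tau(t):
    """Time-scale model: map calendar storage time to effective degradation time."""
    return t ** B

def degradation_reliability(t):
    """P(degradation degree stays below D_F up to time t): first-passage
    (inverse-Gaussian) survival of a Wiener process on the transformed scale."""
    s = tau(t)
    if s <= 0:
        return 1.0
    a = (D_F - MU * s) / (SIGMA * np.sqrt(s))
    c = (-D_F - MU * s) / (SIGMA * np.sqrt(s))
    return float(norm.cdf(a) - np.exp(2 * MU * D_F / SIGMA**2) * norm.cdf(c))

def outburst_survival(t, n_grid=2000):
    """Survival of the outburst failure mode: a Weibull baseline hazard scaled by
    exp(GAMMA * expected degradation), i.e. a proportional-hazards form."""
    grid = np.linspace(1e-6, t, n_grid)
    hazard = (BETA / ETA) * (grid / ETA) ** (BETA - 1) * np.exp(GAMMA * MU * tau(grid))
    return float(np.exp(-np.sum(hazard) * (grid[1] - grid[0])))

def storage_reliability(t):
    """Competing-risk reliability: survive both failure modes up to time t.
    Independence of the two modes is assumed here only to keep the sketch short."""
    return degradation_reliability(t) * outburst_survival(t)

if __name__ == "__main__":
    for year in (1, 3, 5, 8):
        print(f"t = {year:>2} y   R(t) ~ {storage_reliability(year):.4f}")
```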
Abstract: The explosive initiator is a sensitivity-type product with long life and high reliability. To improve storage reliability assessment, a method of storage reliability assessment for explosive initiators was proposed based on a time series model using sensitivity test data. In this method, the up-and-down test was used to estimate the distribution parameters of the threshold, and an approach to designing the up-and-down test was presented to obtain better estimates. Furthermore, shrinkage estimation was introduced to obtain a better estimate of the scale parameter by combining sample information with prior information. Simulation results show that the shrinkage estimate is better than the traditional estimate under certain conditions. With these parameter estimates, time series models were used to describe how the distribution parameters change over storage time; for a fixed storage time, the distribution parameters were then predicted from these models. Finally, the confidence interval of the storage reliability was obtained by fiducial inference. An illustrative example shows that the method is applicable to storage reliability assessment of explosive initiators with high reliability.
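As a rough illustration of the first step of such a method, the sketch below simulates an up-and-down (staircase) sensitivity test and estimates the threshold distribution parameters by maximum likelihood; the shrinkage estimation, time-series extrapolation, and fiducial-interval steps are not reproduced here, and all numerical values and function names are assumed for the example.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)

def run_up_down_test(mu_true=5.0, sigma_true=0.4, start=4.5, step=0.3, n=40):
    """Simulate a staircase (up-and-down) sensitivity test: the stimulus level is
    lowered after a response ('go') and raised after a non-response ('no-go')."""
    level, levels, responses = start, [], []
    for _ in range(n):
        fired = rng.random() < norm.cdf((level - mu_true) / sigma_true)
        levels.append(level)
        responses.append(fired)
        level = level - step if fired else level + step
    return np.array(levels), np.array(responses, dtype=float)

def fit_threshold_distribution(levels, responses):
    """Maximum-likelihood estimate of the normal threshold distribution (mu, sigma)
    from binary go/no-go data; one common way to obtain the distribution
    parameters that are later tracked with the time series models."""
    def neg_log_lik(theta):
        mu, log_sigma = theta
        p = np.clip(norm.cdf((levels - mu) / np.exp(log_sigma)), 1e-9, 1 - 1e-9)
        return -np.sum(responses * np.log(p) + (1 - responses) * np.log(1 - p))
    res = minimize(neg_log_lik, x0=[levels.mean(), np.log(levels.std() + 1e-3)])
    return res.x[0], float(np.exp(res.x[1]))

if __name__ == "__main__":
    lv, rs = run_up_down_test()
    mu_hat, sigma_hat = fit_threshold_distribution(lv, rs)
    print(f"estimated threshold mean = {mu_hat:.3f}, scale = {sigma_hat:.3f}")
```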
Funding: Partially supported by the Shandong Provincial Natural Science Foundation (No. ZR2017QF005), the National Natural Science Foundation of China (Nos. 61702304, 61971269, 61832012, 61602195, 61672321, 61771289, and 61602269), and the China Postdoctoral Science Foundation (No. 2017M622136).
Abstract: In the era of big data, sensor networks have been pervasively deployed, producing a large amount of data for various applications. However, because sensor networks are usually placed in hostile environments, managing this huge volume of data is a challenging issue. In this study, we focus on the data storage reliability problem in heterogeneous wireless sensor networks, where robust storage nodes are deployed in the network and data redundancy is provided through coding techniques. To minimize data delivery and data storage costs, we design an algorithm that jointly optimizes data routing and storage node deployment. The problem can be formulated as a binary nonlinear combinatorial optimization problem, and because of its NP-hardness, designing approximation algorithms is highly nontrivial. By leveraging the Markov approximation framework, we design an efficient algorithm driven by a continuous-time Markov chain to schedule the deployment of the storage nodes and the corresponding routing strategy. We also perform extensive simulations to verify the efficacy of our algorithm.
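A toy sketch of the Markov approximation idea is given below: candidate storage-node placements form the state space of a Markov chain whose transition probabilities depend on the change in total delivery-plus-deployment cost, so the chain's stationary distribution concentrates on low-cost placements. The line-topology instance, cost function, and parameter values are illustrative assumptions, not the formulation used in the paper.

```python
import itertools
import math
import random

random.seed(1)

# --- illustrative instance (assumed): 8 sensors on a line, 2 storage nodes to place ---
N_SENSORS, K_STORAGE = 8, 2
DEPLOY_COST = 3.0                     # cost of deploying one storage node

def total_cost(placement):
    """Each sensor routes its data to the nearest storage node (hop distance on a
    line stands in for routing cost), plus the deployment cost of the nodes."""
    return sum(min(abs(s - p) for p in placement) for s in range(N_SENSORS)) \
        + DEPLOY_COST * len(placement)

def neighbors(placement):
    """Neighboring configurations: move one storage node to an unused location."""
    current = set(placement)
    for p in current:
        for q in range(N_SENSORS):
            if q not in current:
                yield tuple(sorted((current - {p}) | {q}))

def markov_approximation(beta=2.0, iters=3000):
    """Markov-approximation search: jump between placements with a logit acceptance
    probability in the cost difference, so the chain favors low-cost states."""
    state = tuple(range(K_STORAGE))
    best, best_cost = state, total_cost(state)
    for _ in range(iters):
        nxt = random.choice(list(neighbors(state)))
        delta = total_cost(nxt) - total_cost(state)
        if random.random() < 1.0 / (1.0 + math.exp(beta * delta)):
            state = nxt
        if total_cost(state) < best_cost:
            best, best_cost = state, total_cost(state)
    return best, best_cost

if __name__ == "__main__":
    brute = min(itertools.combinations(range(N_SENSORS), K_STORAGE), key=total_cost)
    approx, cost = markov_approximation()
    print("Markov-approximation placement:", approx, "cost:", cost)
    print("Brute-force optimum:           ", brute, "cost:", total_cost(brute))
```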
Abstract: We present Fatman, an enterprise-scale archival storage system built on volunteer resources contributed by underutilized web servers, typically deployed on thousands of nodes with spare storage capacity. Fatman is specifically designed to increase the utilization of existing storage resources and cut down hardware purchase costs. Two major concerns in the system design are maximizing the resource utilization of volunteer nodes without violating service level objectives (SLOs) and minimizing cost without reducing the availability of the archival system. Fatman has been widely deployed on tens of thousands of server nodes across several datacenters, providing more than 100 PB of storage capacity and serving dozens of internal mass-data applications. The system achieves efficient storage quota consolidation through strong isolation and budget limitation, maximizing resource contribution without any degradation of host-level SLOs. It also improves data reliability by applying disk failure prediction to reduce failure recovery cost, an approach named fault-aware data management, which dramatically reduces the mean time to repair (MTTR) by 76.3% and decreases the file crash ratio by 35% on real-life production workloads.
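A hedged sketch of the fault-aware data management idea follows: given per-disk failure probabilities from a predictor, files residing on high-risk disks are proactively re-replicated onto healthy disks so that an eventual failure requires little recovery work. The data structures, risk threshold, and placement policy here are assumptions for illustration only, not Fatman's actual mechanism.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Disk:
    disk_id: str
    failure_prob: float                      # output of a disk failure predictor (assumed given)
    files: List[str] = field(default_factory=list)

def fault_aware_migration(disks: List[Disk], risk_threshold: float = 0.7) -> List[Tuple[str, str, str]]:
    """Plan proactive copies: files on disks whose predicted failure probability
    exceeds the threshold are copied to the healthiest disks, so a later failure
    needs little or no reactive recovery (shorter MTTR)."""
    safe = sorted((d for d in disks if d.failure_prob < risk_threshold),
                  key=lambda d: d.failure_prob)
    plan = []
    if not safe:
        return plan                           # nowhere safe to copy; fall back to reactive repair
    for d in disks:
        if d.failure_prob >= risk_threshold:
            for i, f in enumerate(d.files):
                target = safe[i % len(safe)]  # naive round-robin placement for the sketch
                plan.append((f, d.disk_id, target.disk_id))
    return plan

if __name__ == "__main__":
    cluster = [Disk("d0", 0.02, ["a", "b"]),
               Disk("d1", 0.91, ["c", "d"]),  # predicted to fail soon
               Disk("d2", 0.05, ["e"])]
    for f, src, dst in fault_aware_migration(cluster):
        print(f"pre-copy {f}: {src} -> {dst}")
```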
Funding: Project supported by the National Natural Science Foundation of China (No. 61902135) and the Shandong Provincial Natural Science Foundation, China (No. ZR2019LZH003).
Abstract: Disk failure prediction methods have been useful in handling a single issue, e.g., heterogeneous disks, model aging, or minority samples. However, because these issues often exist simultaneously, prediction models that can handle only one of them suffer from prediction bias in practice. Existing disk failure prediction methods simply fuse various models and lack a discussion of training data preparation and learning patterns when facing multiple issues, even though the solutions to different issues often conflict with one another. Therefore, we first explore training data preparation for multiple issues via a data partitioning pattern, i.e., our proposed multi-property data partitioning (MDP). Then, we treat learning with the partitioned data for multiple issues as learning multiple tasks, and introduce the model-agnostic meta-learning (MAML) framework to accomplish this learning. Based on these improvements, we propose a novel disk failure prediction model named MDP-MAML. MDP addresses the challenges of uneven partitioning and the difficulty of partitioning by time, and MAML addresses the challenge of learning from multiple domains and minority samples across multiple issues. In addition, MDP-MAML can assimilate emerging issues for learning and prediction. On datasets reported by two real-world data centers, compared to state-of-the-art methods, MDP-MAML improves the area under the curve (AUC) from 0.85 to 0.89 and the false detection rate (FDR) from 0.85 to 0.91, while reducing the false alarm rate (FAR) from 4.88% to 2.85%.
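The sketch below illustrates the inner/outer loop of (first-order) MAML applied to tasks produced by a data partitioning step, using a toy NumPy logistic-regression learner so the example stays self-contained; the partition generator and all hyperparameters are illustrative stand-ins for MDP rather than the authors' actual pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_partition(shift):
    """Toy stand-in for one data partition (e.g., one disk model / time window):
    binary 'healthy vs. about-to-fail' samples with a partition-specific shift."""
    X = rng.normal(size=(64, 4)) + shift
    w_true = np.array([1.0, -2.0, 0.5, 1.5])
    y = (X @ w_true + 0.3 * rng.normal(size=64) > shift @ w_true).astype(float)
    return X, y

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad(w, X, y):
    """Gradient of the mean logistic loss for weights w."""
    return X.T @ (sigmoid(X @ w) - y) / len(y)

def maml(tasks, inner_lr=0.1, outer_lr=0.05, steps=300):
    """First-order MAML: adapt a copy of the shared weights on each task's support
    set (inner loop), then update the shared weights with the query-set gradient
    evaluated at the adapted weights (outer loop)."""
    w = np.zeros(4)
    for _ in range(steps):
        meta_grad = np.zeros_like(w)
        for X, y in tasks:
            Xs, ys, Xq, yq = X[:32], y[:32], X[32:], y[32:]   # support / query split
            w_adapted = w - inner_lr * grad(w, Xs, ys)        # one inner-loop step
            meta_grad += grad(w_adapted, Xq, yq)              # first-order outer gradient
        w -= outer_lr * meta_grad / len(tasks)
    return w

if __name__ == "__main__":
    partitions = [make_partition(rng.normal(scale=0.5, size=4)) for _ in range(5)]
    w = maml(partitions)
    X, y = partitions[0]
    acc = ((sigmoid(X @ w) > 0.5) == y).mean()
    print(f"meta-initialization accuracy on one partition: {acc:.2f}")
```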