Abstract
As computer architectures and computing scale keep growing, faults are significantly more likely to occur inside a cluster than on a single computing node, and faults have become the norm. Active redundancy is a common way to ensure system reliability, and fault prediction plays a vital role in it. Through fault prediction, the running state of the computing nodes in a cluster can be evaluated and judged, so that failover of a node is completed before a real fault occurs, which improves system reliability. This paper presents the relevant definitions and evaluation criteria for fault prediction in information devices, and proposes a condition-based maintenance (CBM) scheme suited to enterprise-level application deployment.
Source
《现代计算机》
2016, No. 3, pp. 70-74, 80 (6 pages)
Modern Computer