Abstract
This paper takes the back-end nodes of a cluster-based Web server as its research object and aims to improve their availability, and thereby the availability of the whole cluster, by reducing their MTTR (mean time to repair). First, after analyzing the shortcomings of existing failure recovery schemes, an improved scheme is proposed. In the new scheme, the state messages of the back-end nodes are detected and sent dynamically, and a failure suspicion state is introduced for the back-end nodes; together these remedy the shortcomings of the existing schemes. Finally, an experimental environment is built to test the scheme. The experimental results show that, compared with the existing scheme, the improved scheme reduces MTTR by 63%, which substantially improves the availability of the back-end nodes and hence of the cluster-based Web server.
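The link between MTTR and availability that the abstract relies on can be illustrated with the standard steady-state formula, availability = MTTF / (MTTF + MTTR). The sketch below is only an illustration: the MTTF and baseline MTTR values are hypothetical and do not come from the paper; only the 63% reduction is taken from the abstract.

```python
def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: the fraction of time a node is up."""
    if mttf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTTF must be positive and MTTR non-negative")
    return mttf_hours / (mttf_hours + mttr_hours)


# Hypothetical figures, chosen only for illustration (not from the paper).
MTTF_HOURS = 1000.0        # assumed mean time to failure of a back-end node
OLD_MTTR_HOURS = 1.0       # assumed MTTR under the existing scheme

# The abstract reports a 63% reduction in MTTR for the improved scheme.
NEW_MTTR_HOURS = OLD_MTTR_HOURS * (1 - 0.63)

old_availability = availability(MTTF_HOURS, OLD_MTTR_HOURS)
new_availability = availability(MTTF_HOURS, NEW_MTTR_HOURS)

print(f"existing scheme: {old_availability:.6f}")
print(f"improved scheme: {new_availability:.6f}")
```

With MTTF held fixed, any reduction in MTTR raises availability, which is why the paper targets repair time rather than failure frequency.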
Source
《计算机工程》
EI
CAS
CSCD
PKU Core Journal (北大核心)
2006, Issue 8, pp. 121-123 (3 pages)
Computer Engineering
Funding
Key project funded by the National "863" Program (2004AA111110)
Keywords
Availability
Cluster-based Web server
Failure detection
Failure recovery