Abstract
To address the untimely troubleshooting of faults on cloud platforms under virtualization, and to guarantee the continuity of cloud services, this paper designs and implements a virtualization fault detection and recovery system on the open-source cloud platform OpenStack. The system consists of a GUI layer, a scheduling layer, a logic layer, and a functional layer. It is built around an event-driven mechanism: information passed through the system is treated as events and processed in time order. Its main components, the perception module, the policy module, and the execution module, call the OpenStack API and the Libvirt API to interact with the virtual machine management layer. The resulting fault detection and recovery framework covers information acquisition, analysis and processing, and fault recovery: it monitors the cloud platform's runtime environment in real time to obtain state parameters, analyzes those parameters against established policies to decide on countermeasures, and recovers from faults automatically. Experimental results show that the system can monitor a cloud platform in real time and recover from faults automatically without agents, which enhances the security of the cloud environment and improves the availability of the cloud platform.
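The event-driven flow described in the abstract (perception, then policy, then execution, with events handled in time order) can be illustrated with a minimal sketch. This is not the paper's implementation: the names `Event`, `EventBus`, `make_policy`, `make_executor`, and `demo`, the event kinds "state" and "fault", and the "restart" action are all hypothetical, and the perception step uses hard-coded samples where a real system would query the OpenStack and Libvirt APIs.

```python
import heapq
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Event:
    timestamp: float   # when the observation or decision was made
    kind: str          # "state" or "fault" in this sketch (assumed names)
    payload: dict = field(default_factory=dict)


class EventBus:
    """Delivers events to subscribers in timestamp order."""

    def __init__(self) -> None:
        self._queue: List[Tuple[float, int, Event]] = []
        self._handlers: Dict[str, List[Callable[[Event], None]]] = {}
        self._seq = 0  # tie-breaker: equal timestamps keep publish order

    def subscribe(self, kind: str, handler: Callable[[Event], None]) -> None:
        self._handlers.setdefault(kind, []).append(handler)

    def publish(self, event: Event) -> None:
        heapq.heappush(self._queue, (event.timestamp, self._seq, event))
        self._seq += 1

    def run(self) -> None:
        # Pop the earliest event first and hand it to every subscriber.
        while self._queue:
            _, _, event = heapq.heappop(self._queue)
            for handler in self._handlers.get(event.kind, []):
                handler(event)


def make_policy(bus: EventBus) -> Callable[[Event], None]:
    """Policy module (hypothetical rule): any non-running VM is a fault."""
    def on_state(event: Event) -> None:
        if event.payload["status"] != "running":
            bus.publish(Event(event.timestamp, "fault",
                              {"vm": event.payload["vm"], "action": "restart"}))
    return on_state


def make_executor(log: List[Tuple[str, str]]) -> Callable[[Event], None]:
    """Execution module: only records the action it would take.
    A real system would call the hypervisor or cloud API here."""
    def on_fault(event: Event) -> None:
        log.append((event.payload["vm"], event.payload["action"]))
    return on_fault


def demo() -> List[Tuple[str, str]]:
    bus = EventBus()
    actions: List[Tuple[str, str]] = []
    bus.subscribe("state", make_policy(bus))
    bus.subscribe("fault", make_executor(actions))
    # Perception module stand-in: hard-coded (timestamp, vm, status) samples.
    samples = [(2.0, "vm-2", "crashed"), (1.0, "vm-1", "running"),
               (3.0, "vm-3", "running")]
    for ts, vm, status in samples:
        bus.publish(Event(ts, "state", {"vm": vm, "status": status}))
    bus.run()
    return actions
```

Calling `demo()` runs the three stand-in modules against the sample readings and returns the recovery actions the executor recorded. Keeping the modules decoupled behind a bus is one way to let a policy change without touching the sensing or recovery code; whether the paper's system is structured this way is an assumption.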
Source
Computer Engineering (《计算机工程》)
CAS
CSCD
PKU Core (北大核心)
2015, No. 2, pp. 7-11, 16 (6 pages in total)
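Funding
National High Technology Research and Development Program of China ("863" Program) (2013AA12A206)
National Natural Science Foundation of China (41104010, 91120002, 61170026)
Fundamental Research Funds for the Central Universities (2042014kf0237)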