In modern computer systems, system event logs have always been the primary source for checking system status. As computer systems grow more complex, interactions between software and hardware become more frequent. Components generate an enormous volume of log information, including running reports and fault information. The sheer quantity of data makes manual analysis a great challenge. In this paper, we implement a log management and analysis system that helps system administrators understand the real-time status of the entire system, classify logs into different fault types, and determine the root cause of faults. In addition, we improve the existing fault correlation analysis method based on the results of system log classification. We evaluate the system in a cloud computing environment. The results show that our system can classify fault logs automatically and effectively. With the proposed system, administrators can easily detect the root cause of faults.
Funding: This work was supported by the National Basic Research 973 Program of China under Grant No. 2014CB340600, the National Natural Science Foundation of China under Grant No. 61272072, and the Program for New Century Excellent Talents in University of China under Grant No. NCET-13-0241.