Abstract
Modern cloud services are monitored by numerous multidomain, multivendor monitoring tools, which generate massive numbers of alerts and events that are not actionable. These alerts usually carry isolated messages that lack service context. When such alert events are routed directly to incident management systems, administrators are inundated with the resulting tickets. Noisy alerts increase the risk that crucial warnings go undetected and lead to service outages. One feasible way to cope with these problems is to reveal the correlations behind a large number of alerts and then aggregate related alerts according to those correlations. Following this approach, this paper proposes AlertInsight, a framework for alert event reduction. In AlertInsight, the correlations among event sources are found by mining a sequence of historical events. This event correlation knowledge is then used to build an online detector that targets the correlated events hidden in the event stream. Finally, the correlated events are aggregated into a single high-level event for alert reduction. Because the commonly used pairwise correlation analysis methods are weak in complex environments, this paper proposes an innovative approach to multiple correlation mining that overcomes computational complexity challenges by scanning panoramic views of historical episodes from a holistic perspective. In addition, it proposes a neural network-based correlated event detector that learns the event correlation knowledge generated by correlation mining and then detects correlated events in a sequence online. Experiments are conducted to test the effectiveness of AlertInsight. The experimental results (precision = 0.92, recall = 0.93, and F1-score = 0.93) demonstrate the performance of AlertInsight in recognizing multiple correlated alerts and its competence for alert reduction.
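To make the final aggregation step concrete, the following minimal sketch collapses alerts from correlated event sources into single high-level events. It is an illustration under stated assumptions, not the AlertInsight implementation: the `Alert` record, the `aggregate_alerts` function, the fixed time window, and the hand-written correlation groups are all hypothetical, and a plain lookup table stands in for both the correlation mining and the neural network-based detector.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Tuple


@dataclass(frozen=True)
class Alert:
    """Hypothetical alert record: emission time in seconds and event source."""
    timestamp: float
    source: str


def aggregate_alerts(
    alerts: Iterable[Alert],
    correlated_groups: Iterable[FrozenSet[str]],
    window: float,
) -> List[Tuple[FrozenSet[str], List[Alert]]]:
    """Merge alerts whose sources share a correlation group and that occur
    within `window` seconds of the first alert of the current episode.

    Assumption: the correlation groups are given and disjoint; in the paper
    this knowledge comes from mining historical events.
    """
    # Map every source to the correlation group it belongs to.
    source_to_group = {}
    for group in correlated_groups:
        for source in group:
            source_to_group[source] = group

    events: List[Tuple[FrozenSet[str], List[Alert]]] = []
    open_event = {}  # group -> index of its most recent high-level event

    for alert in sorted(alerts, key=lambda a: a.timestamp):
        # A source without known correlations forms a group of its own.
        group = source_to_group.get(alert.source, frozenset({alert.source}))
        idx = open_event.get(group)
        if idx is not None and alert.timestamp - events[idx][1][0].timestamp <= window:
            events[idx][1].append(alert)  # same episode: fold into the event
        else:
            open_event[group] = len(events)  # start a new high-level event
            events.append((group, [alert]))
    return events
```

With one correlation group covering three sources and a 60-second window, three alerts that arrive within a few seconds of each other become one high-level event, while an alert from an uncorrelated source, or a later alert outside the window, starts its own event.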