Abstract
The additivity of multidimensional key performance indicators (KPIs) can be exploited to locate the root causes of failures in large-scale Internet services. A change in KPI data caused by one or more root causes propagates to a large number of related KPI values. This paper proposes PASER, a pruning search model based on anomaly similarity and effectiveness factor for root cause location. Building on an anomaly propagation model for multidimensional KPIs, PASER introduces a potential score that measures how likely a candidate set is to be the root cause. Its effectiveness-based, layer-by-layer pruning search algorithm reduces the root cause location time to about 5.3 s on average. Comparative experiments also evaluate the accuracy and timeliness of the time series prediction algorithms used in root cause location. On the dataset used, PASER achieves an F-score of 0.99.
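The additivity property mentioned in the abstract can be illustrated with a minimal sketch. It assumes an additive KPI (such as a request count) recorded at the finest attribute combination, so that the value of any coarser combination is the sum of its leaves. The attribute names, the sample values, and the `aggregate` helper are hypothetical and are not taken from the paper; this is not the PASER implementation.

```python
from collections import defaultdict

# Hypothetical leaf-level KPI records: each entry is the finest-grained
# attribute combination (province, isp) with an additive KPI value.
LEAVES = [
    ({"province": "Beijing", "isp": "Mobile"}, 120.0),
    ({"province": "Beijing", "isp": "Unicom"}, 80.0),
    ({"province": "Shanghai", "isp": "Mobile"}, 50.0),
    ({"province": "Shanghai", "isp": "Unicom"}, 30.0),
]


def aggregate(leaves, dims):
    """Sum leaf KPI values, grouping by the attributes listed in dims.

    Attributes not listed in dims are summed out, which is only valid
    because the KPI is assumed to be additive.
    """
    totals = defaultdict(float)
    for attrs, value in leaves:
        key = tuple(attrs[d] for d in dims)
        totals[key] += value
    return dict(totals)


by_province = aggregate(LEAVES, ["province"])  # coarser view: one attribute
by_isp = aggregate(LEAVES, ["isp"])            # another one-attribute view
overall = aggregate(LEAVES, [])                # all attributes summed out

print(by_province)
print(by_isp)
print(overall)
```

Because every coarser view is derived from the same leaves, an anomaly in a single leaf combination also shifts each aggregate that contains it, which is why one root cause can change many related KPI values.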
Authors
靖宇涵
何波
张凌昕
李天星
王敬宇
刘聪
JING Yu-Han; HE Bo; ZHANG Ling-Xin; LI Tian-Xing; WANG Jing-Yu; LIU Cong (State Key Laboratory of Networking and Switching Technology (Beijing University of Posts and Telecommunications), Beijing 100876, China; EBUPT Information Technology Co., Ltd., Beijing 100191, China; China Mobile Research Institute, Beijing 100053, China)
Source
《软件学报》
EI
CSCD
Peking University Core Journals
2022, Issue 2, pp. 738-750 (13 pages)
Journal of Software
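Funding
National Natural Science Foundation of China (62071067)
Ministry of Education-China Mobile Research Fund (MCM20200202)
Beijing University of Posts and Telecommunications-China Mobile Research Institute Joint Innovation Center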
Keywords
AIOps
multidimensional KPIs
root cause location
pruning search