Abstract
The adoption of cloud-native technology has made IT systems larger in scale and more complex in architecture, posing new challenges for IT operations and maintenance. As business system clusters keep growing, traditional alerting fails to detect system anomalies promptly and effectively, massive log volumes cannot be analyzed effectively, and service call chains are complex with poor observability, which makes fault demarcation and localization extremely difficult. Using big data, AI, and automated orchestration, the authors developed a business end-to-end solution for intelligent fault discovery, diagnosis, and self-healing. The solution integrates three types of data (metrics, logs, and traces) to achieve automatic fault discovery, diagnosis, and self-healing, ushering in an era of minimalist operations and maintenance.
Authors
ZUO Jinhu (左金虎)
CHEN Lihua (陈理华)
XIAO Zhongliang (肖忠良)
China Mobile Information Technology Co., Ltd., Beijing 102200, China
Source
Modern Information Technology (《现代信息科技》)
2022, Issue 24, pp. 85-89 (5 pages)
Keywords
business end-to-end
intelligent fault diagnosis
fault self-healing
AIOps