Abstract
With the rapid growth of Internet services, distributed microservice-based applications have gradually replaced traditional monolithic applications as one of the main forms of Internet application. Microservice-based applications offer scalability, fault tolerance, and high availability, but they are also cumbersome to build, complicated to deploy, and difficult to maintain. Monitoring and operating microservices in cloud environments is an active research topic, yet existing solutions, including Kubernetes, the most popular container-based cluster management system, still suffer from coarse granularity and inaccurate fault localization. To address these problems, this paper proposes a microservice fault diagnosis method based on pattern matching. First, a proxy is injected to forward request traffic, and the tracing information of the microservices is collected and modeled. Next, state information is collected while the system runs normally, and known faults are injected to collect and characterize the application's behavior after a failure. Finally, the execution trace of an unknown fault is matched against the execution traces of known faults, with string edit distance as the similarity measure, to diagnose the likely cause of the failure. Experimental results show that the method effectively characterizes the execution traces of request processing and accurately localizes the cause of application faults at the granularity of individual microservices.
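The matching step can be illustrated with a minimal sketch. It assumes that each execution trace is encoded as an ordered sequence of service names and that each known fault has one reference trace; the abstract does not specify the actual encoding, and the service names, fault labels, and the `diagnose` helper below are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two sequences of call tokens."""
    prev = list(range(len(b) + 1))  # distance from empty prefix of a to each prefix of b
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def diagnose(unknown: Sequence[str],
             known: Dict[str, Sequence[str]]) -> List[Tuple[str, int]]:
    """Rank known fault labels by edit distance to the unknown trace (closest first)."""
    ranked = sorted((edit_distance(unknown, trace), label)
                    for label, trace in known.items())
    return [(label, dist) for dist, label in ranked]


# Hypothetical reference traces recorded after injecting known faults.
known_faults = {
    "cart failure": ["gateway", "frontend", "cart"],
    "inventory failure": ["gateway", "frontend", "cart", "inventory"],
    "payment failure": ["gateway", "frontend", "cart", "inventory",
                        "payment", "payment"],
}

# Hypothetical trace observed during an undiagnosed failure.
unknown_trace = ["gateway", "frontend", "cart", "inventory",
                 "payment", "payment", "payment"]

ranking = diagnose(unknown_trace, known_faults)
print(ranking[0])  # closest known fault and its distance
```

Comparing token sequences rather than raw strings keeps a multi-character service name from counting as several edits; a real system would likely also normalize the distance by trace length before comparing candidates.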
Authors
CHEN Hao (陈皓)
XU Yuan-Jia (许源佳)
WANG Tao (王焘)
ZHANG Wen-Bo (张文博)
Affiliations: Institute of Software, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100049, China; State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China
Source
《计算机系统应用》
2021, Issue 5, pp. 1-11 (11 pages)
Computer Systems & Applications
Funding
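National Key R&D Program of China (2017YFB1400804)
National Natural Science Foundation of China (61872344)
Beijing Natural Science Foundation (4182070)
Youth Innovation Promotion Association of the Chinese Academy of Sciences, talent program (2018144)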
Keywords
cloud computing
fault diagnosis
execution traces
microservices