Abstract
Low-quality event logs caused by outliers and missing values are often unavoidable in practical business processes. Such logs degrade the performance of process mining algorithms and, in turn, interfere with the correct implementation of decisions. When the system reference model is unknown, existing methods for log anomaly detection and repair suffer from three problems: thresholds must be set manually, it is unclear which behavioral constraints the prediction model learns, and the repair results are poorly interpretable. The pre-trained language model BERT, which uses a masking strategy, can learn general semantics in text from context information in a self-supervised manner. Inspired by this, this paper proposes the model BERT4Log and the weak behavioral profiles theory, and combines them with a multi-layer multi-head attention mechanism to perform interpretable repair of low-quality event logs. The proposed repair method does not require a preset threshold and needs only a single round of self-supervised training. It uses the weak behavioral profiles theory to quantify the degree to which log behavior is repaired, and uses the multi-layer multi-head attention mechanism to explain specific prediction results in detail. Finally, the method is evaluated on a set of public datasets and compared with the current best-performing research. Experimental results show that the overall repair performance of BERT4Log is better than that of the compared methods, and that the model can learn weak behavioral profiles and provide detailed interpretation of repair results.
Authors
李炳辉
方欢
梅振辉
LI Binghui; FANG Huan; MEI Zhenhui (School of Mathematics and Big Data, Anhui University of Science and Technology, Huainan, Anhui 232001, China; Anhui Province Engineering Laboratory for Big Data Analysis and Early Warning Technology of Coal Mine Safety, Huainan, Anhui 232001, China)
Source
《计算机科学》
CSCD
Peking University Core Journal
2023, Issue 5, pp. 38-51 (14 pages)
Computer Science
Funding
National Natural Science Foundation of China (61902002).