Abstract
Information systems collect large volumes of business process event logs during execution. Model discovery aims to derive process models from the behavioral information in these logs, providing factual evidence for understanding and improving business processes. The directly-follows relation (DF) between activities is the most basic behavioral information in an event log and underlies model discovery algorithms. Depending on whether they take the DF frequencies of the log into account, existing model discovery algorithms fall into two classes: frequency-aware and frequency-agnostic. Existing log sampling methods for model discovery focus on improving discovery efficiency but lose the DF frequency information in the event log, so the resulting sample log alters the behavior of the original log when used with a DF frequency-based discovery algorithm. To address this for DF frequency-based model discovery algorithms, a behavior invariance-oriented log sampling method is proposed. The method has three stages: selecting trace variants and their frequencies by ratio, computing the DF weight of each trace, and set-cover-based sampling, so that the behavioral information contained in the sample log is consistent with that of the original log. Experiments on public event log datasets show that, compared with existing log sampling methods, the sample logs produced by the proposed method retain the DF frequency information of the original log more accurately, thereby ensuring higher model discovery quality.
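To make the notion of DF frequency concrete, the following is a minimal Python sketch, not the paper's algorithm. It counts directly-follows pairs in a toy log and compares the relative DF frequencies of two candidate samples against the original. The log contents, the function names (`df_counts`, `df_relative`, `df_deviation`), and the deviation measure itself are illustrative assumptions that do not appear in the abstract.

```python
from collections import Counter

def df_counts(log):
    """Count directly-follows (DF) pairs in a log given as a list of traces."""
    counts = Counter()
    for trace in log:
        # zip pairs every activity with its immediate successor in the trace
        counts.update(zip(trace, trace[1:]))
    return counts

def df_relative(counts):
    """Normalize DF counts to relative frequencies."""
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()} if total else {}

def df_deviation(original, sample):
    """Assumed measure: sum of absolute differences of relative DF frequencies."""
    p = df_relative(df_counts(original))
    q = df_relative(df_counts(sample))
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))

# Hypothetical toy log: three trace variants with frequencies 6, 2 and 2.
original = [("a", "b", "c")] * 6 + [("a", "c", "b")] * 2 + [("a", "b", "b", "c")] * 2

# Sample that halves every variant frequency (ratio 0.5).
proportional = [("a", "b", "c")] * 3 + [("a", "c", "b")] + [("a", "b", "b", "c")]

# Sample that keeps one trace per variant and ignores frequencies.
variants_only = [("a", "b", "c"), ("a", "c", "b"), ("a", "b", "b", "c")]

print(df_counts(original))
print(df_deviation(original, proportional))
print(df_deviation(original, variants_only))
```

In this toy case the proportional sample reproduces the relative DF frequencies of the original log, whereas the one-trace-per-variant sample covers the same DF pairs but distorts their frequencies. That distortion is the kind of information loss the abstract attributes to existing sampling methods.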
Authors
ZHANG Shuaipeng (张帅鹏)
LIU Cong (刘聪)
SU Xuan (苏轩)
WEN Lijie (闻立杰)
SONG Rongjia (宋容嘉)
ZENG Qingtian (曾庆田)
ZHANG Shuaipeng; LIU Cong; SU Xuan; WEN Lijie; SONG Rongjia; ZENG Qingtian (School of Computer Science and Technology, Shandong University of Technology, Zibo 255000, China; College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China; School of Software, Tsinghua University, Beijing 100084, China; School of Management, Hangzhou University of Electronic Science and Technology, Hangzhou 310018, China)
Source
《计算机集成制造系统》
EI
CSCD
Peking University Core Journals (北大核心)
2024, No. 8, pp. 2809-2821 (13 pages)
Computer Integrated Manufacturing Systems
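Funding
National Natural Science Foundation of China (62472264)
Taishan Scholars Program of Shandong Province (ts20190936, tsqn201909109)
Excellent Young Scientists Fund of the Shandong Provincial Natural Science Foundation (ZR2021YQ45)
Youth Innovation Science and Technology Program for Innovation Teams of Shandong Higher Education Institutions (2021KJ031)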