Abstract
Traditional fuzzy mining algorithms perform poorly on large-scale datasets, cannot handle long-distance dependencies, and are cumbersome to use because their parameters must be configured manually. To address these problems, we propose an adaptive parallel fuzzy mining (APFM) algorithm. APFM configures its parameters automatically and processes large-scale datasets in parallel, which improves data-processing efficiency. APFM also optimizes the modeling process: it handles activity relations by combining a global and a local analysis, obtains the activity set of the process model with a bottom-up method, and mines long-distance dependencies in the process model by computing a long-distance dependency factor. Experiments show that, on large-scale datasets, APFM processes data efficiently and yields more accurate process models.
Author
Zhao Hongbo (赵洪博), School of Software, Fudan University, Shanghai 200433, China
Source
Computer Applications and Software (《计算机应用与软件》)
Indexed in: Peking University Core Journals (北大核心)
2022, Issue 6, pp. 288-296 (9 pages)
Keywords
Fuzzy mining
Automatic parameter configuration
Parallel optimization
Modeling optimization