Abstract
Order-preserving pattern (OPP) mining is an emerging direction in data mining that finds subsequences of a time series sharing the same relative order. Although existing OPP mining algorithms can find all frequent patterns, they are inefficient when the user is interested only in a specific pattern and in the patterns that have it as a prefix. To address this problem, this paper proposes a co-occurrence order-preserving pattern (COOP) mining algorithm, which mines the co-occurrence order-preserving patterns that have a given pattern as their prefix. The COOP algorithm consists of two main parts: fusion preparation and calculation of the support of the super-patterns. Fusion preparation is divided into four steps: obtaining the suffix OPP of pattern p, calculating the occurrences of the suffix OPP, forward verification of the occurrences of pattern p, and backward search for the occurrences of all fusible patterns. When calculating the support of the super-patterns, a pruning strategy is proposed to further reduce the number of candidate patterns. Experimental results on real datasets verify the efficiency of the COOP algorithm.
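To make the notions of relative order, occurrence, and support concrete, the following is a minimal brute-force sketch in Python. It is not the COOP algorithm described in the abstract: it has no fusion preparation and no pruning, and it extends the given prefix by only one element. The function names (`relative_order`, `occurrences`, `prefix_super_patterns`), the `minsup` parameter, and the assumption that all values in a window are distinct are illustrative choices, not taken from the paper.

```python
from collections import Counter


def relative_order(window):
    """Return the rank (1 = smallest) of each value in the window.

    Assumes the values in the window are distinct (illustrative assumption).
    """
    order = sorted(range(len(window)), key=lambda i: window[i])
    ranks = [0] * len(window)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return tuple(ranks)


def occurrences(series, pattern):
    """Start indices of the windows whose relative order equals the pattern."""
    m = len(pattern)
    target = tuple(pattern)
    return [s for s in range(len(series) - m + 1)
            if relative_order(series[s:s + m]) == target]


def prefix_super_patterns(series, prefix, minsup):
    """Naive baseline: extend every occurrence of the prefix by one value
    and keep the resulting patterns whose support is at least minsup."""
    m = len(prefix)
    counts = Counter()
    for s in occurrences(series, prefix):
        if s + m < len(series):  # one more value must exist to extend
            counts[relative_order(series[s:s + m + 1])] += 1
    return {p: c for p, c in counts.items() if c >= minsup}
```

For example, in the series `[1, 3, 2, 5, 4, 6, 3, 7]` the prefix `(1, 2)` (a rise) occurs at start indices 0, 2, 4, and 6. Calling `prefix_super_patterns(series, (1, 2), 2)` keeps only `(1, 3, 2)`, which is supported by two of those occurrences.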
Authors
WANG Zhen (王珍)
WU Youxi (武优西)
MENG Yufei (孟玉飞)
LI Yan (李艳)
Affiliations: School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China; School of Economics and Management, Hebei University of Technology, Tianjin 300401, China
Source
Journal of Chinese Computer Systems (《小型微型计算机系统》)
CSCD
Peking University Core Journals (北大核心)
2024, Issue 6, pp. 1384-1391 (8 pages)
Funding
Supported by the Social Science Foundation of Hebei Province (HB19GL055).
Keywords
sequential pattern mining
time series
order-preserving pattern
co-occurrence pattern