Abstract
Order-preserving sequential pattern mining finds subsequences of a time series that share exactly the same (most precise) order-preserving pattern, and it can be used to predict disease progression trends. Mining only exact order-preserving patterns, however, often misses important information: some order-preserving patterns are not identical yet remain highly similar. This paper therefore proposes an approximate order-preserving pattern mining algorithm (Approximate Order Preserving Pattern Mining, AOPM), which mines order-preserving patterns with different degrees of approximation depending on the input parameter values. For candidate pattern generation, AOPM uses a pattern fusion strategy based on prefix and suffix splicing, which reduces the number of meaningless candidate patterns. For pattern support calculation, AOPM first obtains all candidate sequences of a candidate pattern and then performs pattern matching. Comparative experiments on real data sets verify the completeness and efficiency of AOPM.
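The idea of approximate order-preserving matching can be illustrated with a minimal sketch. It is not the AOPM algorithm itself: it assumes that an order-preserving pattern is the rank sequence of a window (1 = smallest value, values distinct) and that the (δ-γ) distance bounds the largest per-position rank difference by δ and the sum of rank differences by γ. The function names `relative_order`, `delta_gamma_match`, and `approx_support` are hypothetical, and the support count uses a naive sliding window rather than the candidate-sequence strategy described in the abstract.

```python
def relative_order(values):
    """Return the rank of each element (1 = smallest); assumes distinct values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


def delta_gamma_match(pattern, window, delta, gamma):
    """Check whether the rank pattern of `window` is within the assumed
    (delta, gamma) bounds of `pattern` (itself a rank sequence)."""
    if len(pattern) != len(window) or not pattern:
        return False
    diffs = [abs(p - r) for p, r in zip(pattern, relative_order(window))]
    # Assumption: delta bounds the largest deviation, gamma bounds the total.
    return max(diffs) <= delta and sum(diffs) <= gamma


def approx_support(pattern, series, delta, gamma):
    """Count windows of `series` that approximately match `pattern`
    (naive sliding window, for illustration only)."""
    m = len(pattern)
    return sum(
        delta_gamma_match(pattern, series[i:i + m], delta, gamma)
        for i in range(len(series) - m + 1)
    )


if __name__ == "__main__":
    series = [10, 30, 20, 40, 35]
    print(approx_support([1, 3, 2], series, delta=0, gamma=0))  # exact matching
    print(approx_support([1, 3, 2], series, delta=2, gamma=4))  # looser matching
```

With δ = γ = 0 the check reduces to exact order-preserving matching; larger values admit windows whose rank pattern deviates slightly, which is the sense in which the input parameters control the degree of approximation.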
Authors
刘锦
武优西
王月华
李艳
LIU Jin; WU You-xi; WANG Yue-hua; LI Yan (School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China; School of Economics and Management, Hebei University of Technology, Tianjin 300401, China)
Source
Journal of Chinese Computer Systems (《小型微型计算机系统》)
CSCD
Peking University Core Journals (北大核心)
2023, No. 3, pp. 490-496 (7 pages)
Funding
Supported by the National Natural Science Foundation of China (61976240).
Keywords
pattern mining
time series
order-preserving sequence
(δ-γ) distance
pattern fusion
pattern matching