Abstract
As an important branch of sequential pattern mining (SPM), contrast SPM effectively identifies patterns that differ significantly between categories and is widely used in sequence classification, feature extraction, and other scenarios. However, traditional contrast SPM considers only whether a pattern occurs in a sequence and ignores how often the pattern repeats within it. It also requires users to set gap constraints in advance, which makes the algorithms inflexible. To address these problems, this paper proposes OSCP, a self-adaptive contrast sequential pattern mining algorithm under the one-off condition. OSCP uses a reverse filling strategy to calculate pattern support, which both accounts for the specific occurrences of a pattern in a sequence and improves computational efficiency, and a pattern join strategy to reduce the number of candidate patterns. In addition, OSCP uses self-adaptive gaps, so support is computed from the actual features of the sequences without user-specified gap constraints. Experimental results show that OSCP outperforms the competing algorithms in both mining performance and classification effect.
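The terms in the abstract can be made concrete with a minimal sketch. It is not the OSCP algorithm: the abstract does not specify the reverse filling or pattern join strategies, so the code below substitutes a plain greedy leftmost scan, which is not guaranteed to find the maximum number of occurrences. The function names (`one_off_count`, `support`, `contrast`) and the contrast measure (difference in mean occurrences per sequence between two classes) are assumptions made for illustration only.

```python
from typing import List


def one_off_count(pattern: str, sequence: str) -> int:
    """Greedily count occurrences of `pattern` in `sequence`.

    One-off condition: each sequence position is used by at most one occurrence.
    Self-adaptive gap (as assumed here): any number of characters may lie
    between consecutive pattern items, so no gap bounds are supplied.
    """
    if not pattern:
        return 0
    used = [False] * len(sequence)
    count = 0
    while True:
        picked = []
        pos = 0
        for ch in pattern:
            # Advance to the next unused position holding the wanted item.
            while pos < len(sequence) and (used[pos] or sequence[pos] != ch):
                pos += 1
            if pos == len(sequence):
                return count  # no further complete occurrence exists in this scan
            picked.append(pos)
            pos += 1
        for i in picked:
            used[i] = True
        count += 1


def support(pattern: str, sequences: List[str]) -> int:
    """Total occurrences over a set of sequences (repetition is counted)."""
    return sum(one_off_count(pattern, s) for s in sequences)


def contrast(pattern: str, positive: List[str], negative: List[str]) -> float:
    """Hypothetical contrast score: difference in mean occurrences per sequence."""
    return support(pattern, positive) / len(positive) - support(pattern, negative) / len(negative)


if __name__ == "__main__":
    pos_class = ["aabb", "abab"]
    neg_class = ["ba", "bbaa"]
    print(contrast("ab", pos_class, neg_class))
```

In this toy data the pattern `ab` repeats within each positive sequence and never occurs in the negative ones, which is the kind of repetition a presence-only measure would not capture.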
Authors
谢婷萱
武优西
王月华
李艳
XIE Tingxuan; WU Youxi; WANG Yuehua; LI Yan (School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China; School of Economics and Management, Hebei University of Technology, Tianjin 300401, China)
Source
《小型微型计算机系统》
CSCD
PKU Core Journal
2024, No. 8, pp. 1808-1815 (8 pages)
Journal of Chinese Computer Systems
Funding
Supported by the National Natural Science Foundation of China (61976240).
Keywords
sequential pattern mining
contrast pattern
candidate pattern generation
sequence classification