摘要
具有间隙约束条件模式匹配问题是序列模式挖掘问题的基础与核心.无重叠模式匹配是其中的一种方法,当前研究是在间隙为正的精确模式匹配,为了进一步增加匹配的灵活性,本文探索了一般间隙近似无重叠模式匹配问题.本文提出一种有效的求解算法,该算法首先将问题转化为网树;然后为了有效地避免可行解丢失,提出近似监测机制以解决该问题;采用迭代搜索最左孩子策略的方式寻找无重叠出现;之后在网树上剪枝找到的无重叠出现,并迭代上述过程直至没有新的无重叠出现产生.最后本文理论分析了算法的空间复杂度和时间复杂度.大量实验结果验证了本文算法具有较好的求解质量及求解效率.
Pattern matching problem with gap constraints is the basis and core of the pattern mining problem.Nonoverlapping pattern matching is one of these problems.Nowadays,researchers mainly focus on the positive gap constraints and exact pattern matching.In order to further increasing the flexibility of the match,this paper explores the nonoverlapping approximation pattern matching with general gaps problem.The paper proposes an effective algorithm to deal with the problem.Firstly,it transforms the problem into a nettree.To avoid missing the feasible solution effectively,it proposes approximate monitoring mechanism to solve the problem.It employs the iterative searching the most left child strategy to find the nonoverlapping occurrence.Then it prunes the found occurrence in the nettree.It iterates the above process,until there is no new nonoverlapping occurrence.Finally,we theoretical analyze the space and time complexities.Experimental results show that the proposed algorithm has better efficiency and quality than other competitive algorithms.
作者
武优西
陈彤
闫文杰
高雪冬
WU You-xi;CHEN Tong;YAN Wen-jie;GAO Xue-dong(School of Artificial Intelligence,Hebei University of Technology,Tianjin 300401,China;Hebei Province Key Laboratory of Big Data Calculation,Tianjin 300401,China)
出处
《小型微型计算机系统》
CSCD
北大核心
2020年第2期296-302,共7页
Journal of Chinese Computer Systems
基金
国家自然科学基金项目(61702157)资助.
关键词
模式匹配
网树
一般间隙
近似
无重叠
pattern matching
nettree
general gap
approximation
nonoverlapping