摘要
近年来,随着生物信息学、信息检索等领域的发展,串模式匹配问题被不断扩展。其中,具有代表性的是在模式中引入可变长度的通配符而形成带有通配符的模式匹配(PMWL)。该问题定义的灵活性给用户提供了方便,却也造成了求解上的困难。因此,如何在多项式时间内得到更好的匹配解成为研究的焦点。提出了一种启发式的小兵算法。小兵算法通过将PMWL问题转化为路径搜索问题,并借鉴动态剪枝思想,在算法搜索的过程中动态地将不可能的匹配位置剪枝,从而提高解的质量。实验在真实DNA序列上进行,并人工生成了196个模式。结果表明,相比于目前最有效的SAIL算法,小兵算法在绝大多数的尾部有重复字符的模式中可以获得更好的匹配解。
Recently,with the development of bioinformatics,network security and information retrieval,wildcards are playing an increasingly key role in pattern ma-tching problems.Among these,the challenging problem of pattern matching with wildcards and length constraints(PMWL)further extends the flexibility of pattern matching,thus bringing convenience to the user but producing the expense of higher computational complexity.Therefore,how to get a better matching solution in polynomial time has been well studied and developed.Based on the previous work,we presented a heuristic Pawn algorithm for PMWL.After transforming the PMWL problem into path search problem,a dynamic strategy is adopted with a pruning procedure in an alternating iterative manner.Experiments demonstrate that,comparing with the state-of-the-art algorithm,Pawn algorithm can improve the number of matching occurrences in most cases of 196 artificial patterns with certain characteristic and the DNA sequences.
出处
《计算机科学》
CSCD
北大核心
2015年第4期244-248,共5页
Computer Science
基金
国家自然科学基金:港澳学者合作研究基金项目(61229301)
国家自然科学基金项目(60828005)
博士后面上基金项目(2012M511403)
安徽省自然科学基金(2013AKZR0082)资助
关键词
模式匹配
通配符
剪枝
约束
Pattern matching
Wildcard
Pruning
Constraints