期刊文献+

一种带有通配符和长度约束模式匹配问题的动态剪枝算法 被引量:1

Dynamic Pruning Algorithm for Pattern Matching with Wildcards and Length Constraints
下载PDF
导出
摘要 近年来,随着生物信息学、信息检索等领域的发展,串模式匹配问题被不断扩展。其中,具有代表性的是在模式中引入可变长度的通配符而形成带有通配符的模式匹配(PMWL)。该问题定义的灵活性给用户提供了方便,却也造成了求解上的困难。因此,如何在多项式时间内得到更好的匹配解成为研究的焦点。提出了一种启发式的小兵算法。小兵算法通过将PMWL问题转化为路径搜索问题,并借鉴动态剪枝思想,在算法搜索的过程中动态地将不可能的匹配位置剪枝,从而提高解的质量。实验在真实DNA序列上进行,并人工生成了196个模式。结果表明,相比于目前最有效的SAIL算法,小兵算法在绝大多数的尾部有重复字符的模式中可以获得更好的匹配解。 Recently,with the development of bioinformatics,network security and information retrieval,wildcards are playing an increasingly key role in pattern ma-tching problems.Among these,the challenging problem of pattern matching with wildcards and length constraints(PMWL)further extends the flexibility of pattern matching,thus bringing convenience to the user but producing the expense of higher computational complexity.Therefore,how to get a better matching solution in polynomial time has been well studied and developed.Based on the previous work,we presented a heuristic Pawn algorithm for PMWL.After transforming the PMWL problem into path search problem,a dynamic strategy is adopted with a pruning procedure in an alternating iterative manner.Experiments demonstrate that,comparing with the state-of-the-art algorithm,Pawn algorithm can improve the number of matching occurrences in most cases of 196 artificial patterns with certain characteristic and the DNA sequences.
出处 《计算机科学》 CSCD 北大核心 2015年第4期244-248,共5页 Computer Science
基金 国家自然科学基金:港澳学者合作研究基金项目(61229301) 国家自然科学基金项目(60828005) 博士后面上基金项目(2012M511403) 安徽省自然科学基金(2013AKZR0082)资助
关键词 模式匹配 通配符 剪枝 约束 Pattern matching Wildcard Pruning Constraints
  • 相关文献

参考文献19

  • 1Navarro G,Raffinot M.Flexible pattern matching in strings:Practical on-line search algorithms for texts and biological sequences[M].Cambridge,UK:Cambridge University Press,2002.
  • 2Han J,Cheng H,Xin D,et al.Frequent pattern mining:CurrentStatus and Future Directions[J].Data Mining and Knowledge Discovery,2007,15(1):55-86.
  • 3Fischer M J,Paterson M S.String matching and other products[R].Complexity of computation,1974:113-125.
  • 4Cole J R,Chai B,Marsh T L,et al.The ribosomal database project(RDP-11):Sequences and tools for high-throughput rRNA analysis[J].Nucleic Acids Research,2005,33 (1):294-296.
  • 5徐小双,冯玉才,王锋,周英飚,张俊.路径分区编码优化小枝查询[J].计算机科学,2010,37(3):182-187. 被引量:1
  • 6Xie F,Wu X D,Hu X G,et al.Sequential Pattern Mining with Wildcards[C]∥22nd IEEE International Conference on Tools with Artificial Intelligence (ICTAI).2010:241-247.
  • 7Guo D,Hu X G,Xie F,et al.Pattern matching with wildcards and gap-length constraints based on a centrality-degree graph[J].Applied Intelligence,2013,39:57-74.
  • 8Bille P,Gortz I,Vildhoj H,et al.String matching with variable length gaps[J].Theoretical Computer Science,2012,443:25-34.
  • 9Chen G,Wu X D,Zhu X Q,et al.Efficient String Matching with Wildcards and Length Constraints[J].Knowledge and Information Systems,2006,10(4):399-419.
  • 10Min F,Wu X D,Lu Z.Pattern matching with independent wildcard gaps[C]∥Eighth IEEE International Conference on Dependable,Autonomic and Secure Computing (DASC-2009).2009:194-199.

二级参考文献41

  • 1李国良,冯建华,塔娜,周立柱.TwigStar——快速处理XML Twig查询中含通配符*的算法[J].计算机研究与发展,2006,43(z3):430-437. 被引量:3
  • 2高军,杨冬青,唐世渭,王腾蛟.基于树自动机的XPath在XML数据流上的高效执行[J].软件学报,2005,16(2):223-232. 被引量:33
  • 3周军锋,孟小峰,蒋瑜,谢敏.F-Index:一种加速Twig查询处理的扁平结构索引[J].软件学报,2007,18(6):1429-1442. 被引量:4
  • 4Miklau G, Suciu D. Containment and equivalence for a fragment of XPath [J]. Journal of the ACM,2004,51(1) :2-45.
  • 5Busse R, Carey M, Florescu D, et al. Xmark: An XML benchmark project. 2003 [EB/OL]. http://monetdb. cwi. nl/xml/index. html.
  • 6Wang W, Wang H Z , Lu H J, et al. Efficient processing of XML path queries using the disk-based F&B index[C]//Proc, of the 31st Int'l Conf. on Very Large Data Bases (VLDB). 2005:145- 156.
  • 7Chen Ting, Lu Jiaheng, Tok W L. On boosting holism in XML twig pattern matching using structural indexing techniques[C]//Proc. of ACM SIGMOD Int ' l Conf. on Management of data. 2005 : 45-46.
  • 8Lu J H,Tok W L,Chart C Y,et al. From region encoding to extended dewey: On efficient processing of xml twig pattern matching[C]//Proc, of VLDB05. 2005:193-204.
  • 9Shanmugasundaram J, et al. Relational databases for querying XML documents: limitations and opportunities [C]//Proc. of VLDB. 1999:302-314.
  • 10Peter T W. Minimizing simple XPath expressions[C]//Proc, of the 4th Int'l Workshop on Web and Databases (WebDB 2001). ACM Press, 2001 : 13-18.

共引文献10

同被引文献9

引证文献1

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部