期刊文献+

求解PMWOC问题的位并行算法

Bit-parallelism algorithm for PMWOC
下载PDF
导出
摘要 带有灵活通配符和One-Off条件的模式匹配问题(pattern matching with flexible wildcards and One-Off condition,PMWOC)具有重要的理论意义和实际应用价值。给定带灵活通配符的模式和文本,目标是在线的计算模式在文本中的出现次数和匹配位置,这里要求任何两次出现不能共享文本同一位置,即One-Off条件。提出了一个基于位并行的搜索算法,采用了非确定有限自动机(nondeterministic finite automatons,NFA)对文本进行扫描。通过理论和实验证明,与其他解决相同问题的算法对比,该算法取得更好的时间性能和空间性能,而且不受模式长度变化和通配符间距变化影响。 The problem of pattern matching with flexible wildcards and One-Off condition( PMWOC) is of great importance for both theory and practice. Giving a pattern with flexible wildcards and a text,it focused on finding the number of occurrences of a pattern in a text and each occurrence's position online. Here,any two occurrences cannot share the same position of the text,namely the One-Off condition. This paper proposed a search algorithm based on bit-parallelism,which adopted nondeterministic finite automatons( NFA) to stimulate the search process. Compared to the previous peers,theoretical analysis and experimental results show that it can get better performances on both time and space. Moreover,it also has better stability with the increasing of the pattern length or the maximum wildcard constraints.
作者 张浩 叶明全
出处 《计算机应用研究》 CSCD 北大核心 2015年第10期2973-2977,共5页 Application Research of Computers
基金 国家自然科学基金资助项目(60828005) 安徽省高校自然科学研究重点项目(KJ2014A198 KJ2014A266)
关键词 模式匹配 通配符 One-Off条件 pattern matching wildcards One-Off condition
  • 相关文献

参考文献21

  • 1武优西,刘亚伟,郭磊,吴信东.子网树求解一般间隙和长度约束严格模式匹配[J].软件学报,2013,24(5):915-932. 被引量:14
  • 2黄国林,郭丹,胡学钢.基于通配符和长度约束的近似模式匹配算法[J].计算机应用,2013,33(3):800-805. 被引量:5
  • 3Haapasalo T,Silvasti P,Sippu S,et al. Online dictionary matchingwith variable-length gaps [ C ] //Proc of the 10th International Confer-ence on Experimental Algorithms. 2011 : 76-87.
  • 4侯宝剑,谢飞,胡学钢,刘应玲,王海平.基于后缀树的带有通配符的模式匹配研究[J].计算机科学,2012,39(12):177-180. 被引量:7
  • 5吴信东,谢飞,黄咏明,胡学钢,高隽.带通配符和One-Off条件的序列模式挖掘[J].软件学报,2013,24(8):1804-1815. 被引量:23
  • 6Bille P,G0rtz I L, Vildh0j H W, e( al. String matching with variablelength gaps[ J]. Theoretical Computer Science, 2012, 443: 25-34.
  • 7Tabbeer S K, Ahmed C F, Jeong B S, ei al. Efficient frequent pat-tern mining over data streams [ C ] //Proc of the 17th ACM Confer-ence on Information and Knowledge Management. New York : ACMPress, 2008: 1447-1448.
  • 8Akutsu T. Approximate string matching with variable length don ’ tcare characters [ J ]. Information Processing Letters, 1995, 55(5):245-239.
  • 9Manber U, Baeza-yates R. An algorithm for string matching with a se-quence of don't cares[ J]. Information Processing Letters, 1991,37(3) :133-136.
  • 10Fischer M J, Paterson M S. String matching and other products[M ] //Complexity of Computation, vol 7. Karp R M. Cambridge,MA: Massachusetts Institute of Technology, 1974.

二级参考文献58

  • 1邹翔,张巍,刘洋,蔡庆生.分布式序列模式发现算法的研究[J].软件学报,2005,16(7):1262-1269. 被引量:19
  • 2范立新.改进的中文近似字符串匹配算法[J].计算机工程与应用,2006,42(34):172-174. 被引量:8
  • 3Lunteren J V. High-performance pattern-matching for intrusion detection//Proceedings of the 25th IEEE International Conference on Computer Communications ( INFOCOM 2006). Barcelona, Spain, 2006:1-13.
  • 4Califf M E, Mooney R J. Bottom up relational learning of pattern matching rules for information extraction. Journal of Machine Learning Research, 2003, 4(6): 177-210.
  • 5Cole R, Gottlieb L A, Lewenstein M. Dictionary matching and indexing with errors and don't cares//Proceedings of the 36th ACM Symposium on the Theory of Computing. New York, USA, 2004:91-100.
  • 6Cole J R, Chai B, Farris R J, Wang Q, Kulam S A, McGartell D M, Garrity G M, Tiedje J M. The ribosomal database project (RDP-II) : Sequences and tools for high-throughput rRNA analysis. Nucleic Acids Research, 2005, 33(Sup. 1): 294-296.
  • 7Zhang M, Kao B, Cheung D, Yip K. Mining periodic patterns with gap requirement from sequences//Proceedings of the ACM SIGMOD International Conference on Management of Data. Maryland, USA, 2005:623-633.
  • 8Han J, Cheng H, Xin D, Yan X. Frequent pattern mining.. Current status and future directions. Data Mining and Knowledge Discovery, 2007, 15(1) : 55-86.
  • 9Ji X, Bailey J, Dong G. Mining minimal distinguishing subsequence patterns with gap constraints. Knowledge and Information Systems, 2007, 11(3): 259- 286.
  • 10He Y, Wu X, Zhu X, Arslan A N. Mining frequent patterns with wildcards from biological sequences//Proceedings of the 2007 IEEE International Conference on Information Reuse and Integration(IRI-07). Las Vegas, USA, 2007: 329-334.

共引文献47

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部