

Algorithm for Pattern Matching with Independent Wildcard Gaps
Abstract Pattern matching is a key technique in applications such as biological sequence analysis and text filtering. A wildcard gap matches any subsequence whose length falls within a specified interval, which makes patterns considerably more flexible. However, most existing work requires every gap in a pattern to be identical. In this paper, we define a new pattern matching problem in which each gap is specified independently, improving flexibility and generality. Based on pattern decomposition, we develop an efficient algorithm that computes the number of all matches. Experimental results show that our algorithm achieves better time complexity and space complexity than comparable algorithms.
Source Journal of Chengdu University (Natural Science Edition) (《成都大学学报(自然科学版)》), 2014, No. 3, pp. 238-241 (4 pages)
Keywords pattern matching; wildcard; gap
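
To make the problem statement concrete, the following is a minimal sketch of counting matches for a pattern whose gaps are set independently. It is a plain dynamic-programming illustration, not the pattern-decomposition algorithm from the paper, whose details are not given here. The function name `count_matches` and the representation of a pattern as a string `chars` plus a list `gaps` of `(lo, hi)` pairs are assumptions made for this example. A match is taken to be a choice of one text position per pattern character, where the number of characters skipped between consecutive positions lies in the corresponding gap interval.

```python
from typing import List, Tuple


def count_matches(text: str, chars: str, gaps: List[Tuple[int, int]]) -> int:
    """Count matches of chars[0] g0 chars[1] g1 ... in text.

    gaps[k] = (lo, hi) bounds the number of wildcard positions between
    chars[k] and chars[k + 1]. This representation is an assumption of
    this sketch, not notation taken from the paper.
    """
    if not chars:
        return 0
    if len(gaps) != len(chars) - 1:
        raise ValueError("need exactly one gap between each pair of characters")
    if any(lo < 0 or hi < lo for lo, hi in gaps):
        raise ValueError("each gap must satisfy 0 <= lo <= hi")

    n = len(text)
    # ways[j]: number of partial matches of the prefix processed so far
    # whose last pattern character is placed at text[j].
    ways = [1 if text[j] == chars[0] else 0 for j in range(n)]

    for k, (lo, hi) in enumerate(gaps, start=1):
        # prefix[j] = ways[0] + ... + ways[j - 1], for O(1) range sums.
        prefix = [0] * (n + 1)
        for j in range(n):
            prefix[j + 1] = prefix[j] + ways[j]

        nxt = [0] * n
        for j in range(n):
            if text[j] != chars[k]:
                continue
            # The previous position i must satisfy lo <= j - i - 1 <= hi.
            first = max(0, j - 1 - hi)
            last = j - 1 - lo
            if last >= first:
                nxt[j] = prefix[last + 1] - prefix[first]
        ways = nxt

    return sum(ways)


if __name__ == "__main__":
    # Pattern A [0,1] G: "A", then 0 or 1 wildcards, then "G".
    print(count_matches("AAGAG", "AG", [(0, 1)]))  # prints 3
```

With a text of length n and a pattern of m characters, this sketch runs in O(nm) time and O(n) extra space. Because each gap is an independent `(lo, hi)` pair, a pattern such as `A [0,1] C [1,2] G` is expressed simply as `count_matches(text, "ACG", [(0, 1), (1, 2)])`.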








