A fast string matching algorithm for large-scale pattern sets (cited by: 1)
Abstract: Pattern matching is a core technology of network and information security systems, but the performance of classical pattern matching algorithms degrades seriously as the pattern set grows, especially beyond 50,000 patterns. To address this problem, this paper proposes ALPM, an architectural large-scale pattern matching algorithm for large-scale pattern sets. ALPM builds on the shift (skip) idea of the classical Wu-Manber (WM) algorithm and takes the characteristics of the hardware architecture into account, applying separate optimization strategies to the preprocessing and matching phases: it indexes the Shift table and the Hash table with two different hash functions, dynamically selects the signature of each pattern used as the table entry during preprocessing, and adjusts the hash collision probability during matching according to the cache size and the size of the pattern set. Experimental results show that, for large-scale pattern sets, the matching performance of ALPM is 5 to 10 times higher than that of the classical WM algorithm.
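As an illustration of the idea of indexing the Shift table and the Hash table with two different hash functions, the following is a minimal Wu-Manber-style sketch. It is not the ALPM implementation: the block size `B`, the table sizes, and the helper names `shift_hash`, `bucket_hash`, `build_tables`, and `search` are assumptions made for this example. The dynamic signature selection and cache-aware tuning described in the abstract are not reproduced.

```python
import zlib
from collections import defaultdict

B = 2  # block size in bytes used for the Shift table (assumed value)


def shift_hash(block: bytes, size: int) -> int:
    """Hypothetical cheap hash of a B-byte block, used for the Shift table."""
    return (block[0] * 31 + block[1]) % size


def bucket_hash(window: bytes, size: int) -> int:
    """Hypothetical second hash over the whole m-byte window, used for the Hash table."""
    return zlib.crc32(window) % size


def build_tables(patterns, shift_size=1 << 12, bucket_size=1 << 16):
    """Preprocess the patterns into a Shift table and a Hash table."""
    m = min(len(p) for p in patterns)  # only the first m bytes of each pattern are indexed
    if m < B:
        raise ValueError("every pattern must be at least B bytes long")
    # Default shift for blocks that occur in no pattern prefix.
    shift = [m - B + 1] * shift_size
    buckets = defaultdict(list)
    for p in patterns:
        for j in range(m - B + 1):
            h = shift_hash(p[j:j + B], shift_size)
            # A block ending at 1-based position j + B allows a shift of m - (j + B).
            # Hash collisions can only lower the shift, so no match is skipped.
            shift[h] = min(shift[h], m - B - j)
        buckets[bucket_hash(p[:m], bucket_size)].append(p)
    return {"m": m, "shift": shift, "buckets": buckets,
            "shift_size": shift_size, "bucket_size": bucket_size}


def search(text: bytes, tables):
    """Return (start, pattern) for every occurrence of a pattern in text."""
    m, shift, buckets = tables["m"], tables["shift"], tables["buckets"]
    hits = []
    end = m  # exclusive end of the current m-byte window text[end - m:end]
    while end <= len(text):
        s = shift[shift_hash(text[end - B:end], tables["shift_size"])]
        if s > 0:
            end += s  # the last block rules out a match here; skip ahead
            continue
        start = end - m
        key = bucket_hash(text[start:end], tables["bucket_size"])
        for p in buckets.get(key, ()):
            if text.startswith(p, start):  # verify the full pattern
                hits.append((start, p))
        end += 1
    return hits


if __name__ == "__main__":
    pats = [b"attack", b"overflow", b"shellcode"]
    tables = build_tables(pats)
    print(search(b"buffer overflow then shellcode attack", tables))
```

Using a second hash over a longer window keeps the candidate lists in the Hash table short when many patterns share the same final block, at the cost of one extra hash computation whenever the shift value is zero.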
Source: Chinese High Technology Letters (《高技术通讯》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2009, No. 6, pp. 551-557 (7 pages)
Funding: Supported by the 863 Program (2007AA01Z468)
Keywords: large-scale pattern set, pattern matching, string matching, hash collision, multi-threading

References (16)

  • 1. Roesch M. Snort: lightweight intrusion detection for networks. In: Proceedings of the 13th Systems Administration Conference, USENIX, Seattle, Washington, USA, 1999. 229-238.
  • 2. ClamAV. Clam AntiVirus. http://www.clamav.net: ClamAV, 2002.
  • 3. Fisk M, Varghese G. An analysis of fast string matching applied to content-based forwarding and intrusion detection: [technical report CS2001-0670]. San Diego: University of California-San Diego, 2002.
  • 4. Jari K, Leena S, Jorma T. Tuning string matching for huge pattern sets. In: Proceedings of the 14th Annual Combinatorial Pattern Matching (CPM) Symposium, Morelia, Mexico, 2003. 211-224.
  • 5. Aho A V, Corasick M J. Efficient string matching: an aid to bibliographic search. Communications of the ACM, 1975, 18(6): 333-340.
  • 6. Coit C J, Staniford S, McAlerney J. Towards faster string matching for intrusion detection or exceeding the speed of Snort. In: Proceedings of the DARPA Information Survivability Conference and Exposition Ⅱ (DISCEX'01), Los Alamitos, CA, USA, 2001. 367-373.
  • 7. Boyer R, Moore J. A fast string searching algorithm. Communications of the ACM, 1977, 20(10): 762-772.
  • 8. Wu S, Manber U. A fast algorithm for multi-pattern searching: [technical report TR-94-17]. Tucson: University of Arizona, 1994.
  • 9. Blumer A, Blumer J, Ehrenfeucht A, et al. Complete inverted files for efficient text retrieval and analysis. Journal of the ACM, 1987, 34(3): 578-595.
  • 10. Allauzen C, Raffinot M. Factor oracle of a set of words: [technical report 99-11]. Institut Gaspard-Monge, Université de Marne-la-Vallée, 1999.
