摘要
AC(Aho-Corasick)自动机是经典的多模式匹配算法,但在模式串字符集较大的情况下,AC自动机的存储开销较大。为降低存储开销提出了存储优化的多模式匹配算法SMMA,该算法在Trie树建立阶段利用正向表来存储每个状态的后续状态指针以及失配指针,而无需存储字符集所有字符的后继指针,从而压缩了每个状态的储存空间。实验表明,所提出的算法与AC自动机算法在时间效率上相近,但极大地降低了存储开销。
Aho-Corasick automaton is a classic multi-pattern matching algorithm; however, when the character set of patterns becomes rather large, its storage overhead correspondingly will be much higher. To solve this problem, this paper proposes a storage-optimized multi-pattern matching algorithm(SMMA). In build-up phase of Trie, the subsequent state pointer and mismatching pointer of each state will be kept in storage based on the indexed table. The algorithm won't store the subsequent state pointer for entire character set of each state, which contributes to the reduction of storage space. The experiments show that the SMMA can reduce the storage overhead greatly while maintaining its efficiency similar to that of AC automation algorithm.
出处
《微型机与应用》
2015年第2期14-17,共4页
Microcomputer & Its Applications
基金
国家自然科学基金(61170108)
国家级大学生创新创业训练计划项目(201310345005)