摘要
为实现网络安全检测中大规模正则表达式的匹配,分析了在从非确定型有限自动机(NFA)到确定型有限自动机(DFA)的子集构造过程中导致状态爆炸性增长的原因,并提出了一种高效的正则表达式匹配方法。这种方法通过将部分DFA状态转变成受限的NFA状态来消除状态数量的剧烈增长,并会形成一种DFA状态与受限的NFA状态交替出现的有限自动机,称为DNFA。DNFA将DFA与NFA结合在一起,实现匹配速度与内存空间占用的平衡,其多层结构也更加适合复杂正则表达式规则。实验结果表明,上述方法可以在大大减少内存需求的情况下,实现正则表达式的高效匹配。
To realize the large-scale regular expression matching in network security inspection, the cause of the state "explosion" during the subset construction process from the nondeterministic finite automation (NFA) to the deter- ministic finite automation (DFA) is analyzed, and then the DNFA, an efficient regular expression matching method is proposed. This method avoids the dramatic growth of the states by transforming part DFA states into limited NFA states, thus the DNFA, a finite automation with the DFA state-limited NFA alternation, is formed. The DNFA takes advantage of the high processing efficiency of the DFA and the compact representation of the NFA to achieve a better trade-off between the memory space and the matching time. It can make a fine granularity splitting of rule set, and its multi-level structure is more suitable for complex regular expression rules in network applications. The experimental result shows that this proposal can provide a high throughout with a moderate memory requirement.
出处
《高技术通讯》
CAS
CSCD
北大核心
2014年第6期551-557,共7页
Chinese High Technology Letters
基金
科技支撑计划(2012BAH37B02
2012BAH42B02)
863计划(2012AA03001)
242计划(2013A012
2013A133)资助项目
关键词
深度包检测
正则表达式
子集分割
有限自动机
混合自动机
deep packet inspection, regular expression, subset splitting, finite automaton, hybrid finite automaton