摘要
随着规则数量的急剧增长,表示正则表达式的DFA(Deterministic Finite Automata,确定型有限自动机)容易引起状态空间爆炸,难以满足高速网络的实时处理需求。提出一种高效的正则表达式匹配算法,该算法通过将正则表达式分割为精确串、字符集合以及重复字符3个子集,分别对其进行分区优化及检测,然后再利用结点信息对匹配信号进行连接,即构建一种特殊的状态机DoLFA(Divide-optimize-Link Finite Automata)。理论分析和仿真结果表明,该算法可以大大节省存储空间,并获得较高的吞吐量,且具有较强的扩展性。
With the rapid increase of the number of rules,the DFA used to present regular expression often results in states explosion,so it is very hard to satisfy the requirement of high speed network online processing.This paper proposed an efficient regular expression matching algorithm,which first divides an expression into three subsets:exact string,character class and character repetition,and then optimizes and detects the corresponding blocks,at last links them together with auxiliary node data structure,namely constructing a special state machine DoLFA.Theoretical analysis and simulation shows that this algorithm not only can save more memory space,but also provide high throughput performance and scalability.
出处
《计算机科学》
CSCD
北大核心
2012年第9期14-19,共6页
Computer Science
基金
国家重点基础研究发展计划(2012CB315901)
国家科技支撑计划(2011BAH19B01)资助
关键词
深度包检测
正则表达式
有限自动机
编码
计数器
Deep packet inspection
Regular expression
Finite automata
Coding
Counter