Nowadays, using Deterministic Finite Automata (DFA) or Non-deterministic Finite Automata (NFA) to parse regular expressions is the most popular way for Deep Packet Inspection (DPI), and the research about DPI focuses ...Nowadays, using Deterministic Finite Automata (DFA) or Non-deterministic Finite Automata (NFA) to parse regular expressions is the most popular way for Deep Packet Inspection (DPI), and the research about DPI focuses on the improvement of DFA to reduce memory. However, most of the existing literature ignores a special kind of "overlap-matching expression", which causes states explosion and takes quite a large part in the DPI rules. To solve this problem, in this paper a new mechanism is proposed based on bitmap. We start with a simple regular expression to describe "overlap-matching expressions" and state the problem. Then, after calculating the terrible number of exploded states for this kind of expressions, the procedure of Bitmap-based Soft Parallel Mechanism (BSPM) is described. Based on BSPM, we discuss all the different types of "overlap-matching ex- pressions" and give optimization suggestions of them separately. Finally, experiment results prove that BSPM can give an excellent performance on solving the problem stated above, and the optimization suggestions are also effective for the memory reduction on all types of "overlap-matching expressions".展开更多
提出一种基于确定的有穷状态自动机(deterministic finite automaton,简称DFA)的正则表达式压缩算法.首先,定义了膨胀率DR(distending rate)来描述正则表达式的膨胀特性.然后基于DR提出一种分片的算法RECCADR(regular expressions cut a...提出一种基于确定的有穷状态自动机(deterministic finite automaton,简称DFA)的正则表达式压缩算法.首先,定义了膨胀率DR(distending rate)来描述正则表达式的膨胀特性.然后基于DR提出一种分片的算法RECCADR(regular expressions cut and combine algorithm based on DR),有效地选择出导致DFA状态膨胀的片段并隔离,降低了单个正则表达式存储需求.同时,基于正则表达式的组合关系提出一种选择性分群算法REGADR(regular expressions group algorithm based on DR),在可以接受的存储需求总量下,通过选择性分群大幅度减少了状态机的个数,有效地降低了匹配算法的复杂性.展开更多
基金Supported by the National High Technology Development 863 Program of China (No. 2008AA01Z117)
文摘Nowadays, using Deterministic Finite Automata (DFA) or Non-deterministic Finite Automata (NFA) to parse regular expressions is the most popular way for Deep Packet Inspection (DPI), and the research about DPI focuses on the improvement of DFA to reduce memory. However, most of the existing literature ignores a special kind of "overlap-matching expression", which causes states explosion and takes quite a large part in the DPI rules. To solve this problem, in this paper a new mechanism is proposed based on bitmap. We start with a simple regular expression to describe "overlap-matching expressions" and state the problem. Then, after calculating the terrible number of exploded states for this kind of expressions, the procedure of Bitmap-based Soft Parallel Mechanism (BSPM) is described. Based on BSPM, we discuss all the different types of "overlap-matching ex- pressions" and give optimization suggestions of them separately. Finally, experiment results prove that BSPM can give an excellent performance on solving the problem stated above, and the optimization suggestions are also effective for the memory reduction on all types of "overlap-matching expressions".
文摘提出一种基于确定的有穷状态自动机(deterministic finite automaton,简称DFA)的正则表达式压缩算法.首先,定义了膨胀率DR(distending rate)来描述正则表达式的膨胀特性.然后基于DR提出一种分片的算法RECCADR(regular expressions cut and combine algorithm based on DR),有效地选择出导致DFA状态膨胀的片段并隔离,降低了单个正则表达式存储需求.同时,基于正则表达式的组合关系提出一种选择性分群算法REGADR(regular expressions group algorithm based on DR),在可以接受的存储需求总量下,通过选择性分群大幅度减少了状态机的个数,有效地降低了匹配算法的复杂性.