摘要
多个正则表达式规则编译成一个DFA(deterministerfiniteautomata)时,会产生状态爆炸、存储急剧增加的现象。针对最严重的状态爆炸问题,从信息论的角度给出了解释,并提出多维数学模型,将冗余状态分为0维状态和1维状态,通过前者按照维度压缩,后者动态构建的方法将空间复杂度降到理论下界,并在此基础上提出多维有限白动机(MFA,multi—dimensionalfiniteautomata)。实验表明,MFA构造时间比XFA略少,比DFA、STT冗余压缩算法和Hybrid.FA降低了2~3个数量级;存储空间比XFA略高,比DFA、STT冗余压缩算法、mDFA、Hybrid-FA降低了1-2个数量级;匹配时间比DFA、Hybrid.FA略多,但是比XFA略少,比sTT冗余压缩算法和mDFA降低了1-2个数量级。
Compiling multiple regular expression signatures into a combined DFA can blowup in state and storage space. Explanations from the prospective of information theory and multi-dimensional mathematical model were proposed fo- cusing on the most serious state explosion. Redundancy states were divided into zero-dimensional ones and one-dimensional ones. The former were compressed by dimension, and the later were dynamically built. The space com- plexity of the model came to the theoretical lower bound by the above methods. Then the multi-dimensional finite auto- mata (MFA) was proposed with the model. Experiments show that, the construction time taken by MFA is slightly less than XFA and is 2-3 orders of magnitude lower than DFA, STT redundancy compression algorithms and Hybrid-FA; the memory space of MFA is slightly higher than XFA, but is 1-2 orders of magnitude lower than DFA, STT redundancy compression algorithms, mDFA and Hybrid-FA; in the aspect of matching time, MFA ranks after DFA and Hybrid-FA, but ranks before XFA, and it achieves 1-2 orders of magnitude lower than that of STT redundancy compression algo- rithms and mDFA.
出处
《通信学报》
EI
CSCD
北大核心
2015年第5期174-186,共13页
Journal on Communications
基金
国家高技术研究发展计划("863"计划)基金资助项目(2011AA01A103
2011AA01A101)
国家重点基础研究发展计划("973"计划)基金资助项目(2012CB315901
2013CB329104)
国家科技支撑计划基金资助项目(2011BAH19B01)~~