基于DFA结构的高速并行正则表达式匹配算法被引量：2

Parallel Architecture for High Throughput DFA-based Regular Expression Matching Algorithm

下载PDF

导出

摘要针对正则表达式匹配速度低的问题,提出一种基于DFA结构的并行匹配算法.正则表达式匹配过程中,DFA的一部分状态访问次数多而另一部分状态访问次数少.因此,建立数学模型,应用马尔科夫链求解各个状态的访问概率,从而将DFA的状态分成前端和后端两个部分.通过多个前端部分共用一个后端部分的方法实现多个数据流的并行处理,达到了提高匹配速度的目的.算法分析与实验表明在多消耗60%-80%的存储空间时,能够提高4.2-4.6倍的匹配速度. Regular expression matching technology has a problem of low throughput, so a parallel architecture based on DFA was pres- ented. From the fact that some states of DFA are visited frequently but others rarely in the process of regular expression matching, a mathematical model and Markov chain were introduced to calculate the access probability of each state. So the DFA was partitioned into head DFA and tail DFA, and variable head DFA share a tail DFA in order to process multi-flows simultaneity. The result of simulation shows the algorithm can achieve 4.2-4.6 times matching speed in a condition of extra 60%-80% times memory.

作者李鲲鹏兰巨龙李玉峰

机构地区国家数字交换系统工程技术研究中心

出处《小型微型计算机系统》 CSCD 北大核心 2013年第5期1050-1053,共4页 Journal of Chinese Computer Systems

基金国家"九七三"重点基础研究发展计划项目(2012CB315901)资助

关键词正则表达式确定有限自动机访问概率前端部分后端部分匹配速度 regular expression DFA probability of each state visited head DFA tail DFA matching speed

分类号 TP393 [自动化与计算机技术—计算机应用技术]

引文网络
相关文献

参考文献12

1Yu F, Chen Z, Diao Y, et al. Fast and memory-efficient regular expression matching for deep packet inspection[C]. In: Proc. of ANCS, SanJose, USA: ACM Press, 2006: 93-102.
2Kumar S, Dharmapurikar S, Yu Fang, et al. Algorithms to ac-celemte multiple regular expressions matching for deep packet in-spection[C]. In: Proc. of SIGCOMM, Pisa, IT: ACM Press, 2006: 339-350.
3Ficarn D, Giordano S, Procissi G, et al. An improved DFA for fast regular expression matching[J]. ACM SIGCOMM Computer Communication Review, 2008, 38(5): 2940.
4Becchi M, Cadambi S. Memory-efficient regular expression search using state merging[CJ. In: Proc. of INFOCOM, Anchorage, USA: IEEE Press, 2007: 1064-1072.
5Ficara D, Giordano S, Procissi G,et al. An improved DFA for fast regular expression matching[J]. ACM SIGCOMM Computer Communication Review ,2008, 38 (5) :2940.
6Kumar S, TurnerJ, WilliamsJ. Advanced algorithms for fast and scalable deep pecket inspection[C]. In: Proc. of ANCS, SanJo-se, USA: ACM Press, 2006: 81-92.
7Zhang S Z, Luo H, Fang B X, et al. Fast and memory-efficient regular expression matching using transition sharing[J]. IEICE Trans. on Information and Systems ,2009 ,E92-D(lO) : 1953-1960.
8于强,霍红卫.一组提高存储效率的深度包检测算法[J].软件学报,2011,22(1):149-163. 被引量：14
9Becchi M, Crowley P. A hybrid finite automaton for practical deep packet inspection[C]. In: Proc. of the 2007 ACM CoNEXT Con-ference, New York, USA: ACM Press, 2007: 1-12.
10Sailesh Kumar, Balakrishnan Chandrasekaran,Jonathan Turner, et al. Curing regular expressions matching algorithms from insomnia, amnesia and acalculia[CJ. In: Proc. of ANCS, Princeton, USA: ACM Press, 2007: 155-164.

二级参考文献16

1李伟男,鄂跃鹏,葛敬国,钱华林.多模式匹配算法及硬件实现[J].软件学报,2006,17(12):2403-2415. 被引量：42
2Aho AV, Corasick MJ. Efficient string matching: An aid to bibliographic search. Communications of the ACM, 1975,18(6): 333-340. [doi: 10.1145/360825.360855].
3Li WN, E YP, Ge JG, Qian HL. Multi-Pattern matching algorithms and hardware based implementation. Journal of Software, 2006, 17(12):2403-2415 (in Chinese with English abstract), http://www.j os.org.cn/1000-9825/17/2403.htm [doi: 10.1360/j os 172403 ].
4Hopcroft JE, Motwani R, Ullman JD. Introduction to Automata Theory, Languages, and Computation. 3rd ed., Reading: Addison Wesley, 2006.
5Hopcroft J. An O(n log n) algorithm for minimizing states in a finite automaton. Technical Report, STAN-CS-TR-71-190, Stanford: Stanford University, 1971.
6Yu F, Chen ZF, Diao YL, Lakshman TV, Katz RH. Fast and memory-efficient regular expression matching for deep packet inspection. In: Bhuyan LN, Dubois M, Eatherton W, eds. Proe. of the 2006 ACM/IEEE Symp. on Architecture for Networking and Communications Systems. New York: ACM, 2006.93-102. [doi: 10.1145/1185347.1185360].
7AbuHmed T, Mohaisen A, Nyang D. A survey on deep packet inspection for intrusion detection systems. Magazine of Korea Telecommunication Society, 2007,24(11):25-36.
8BrodieBC, Cytron RK, Taylor DE. A scalable architecture for high-throughput regular-expression pattern matching. In: Kaeli D, ed. Proc. of the 33rd Int'l Symp. on Computer Architecture. New York: ACM, 2006. 191-202. [doi: 10.1109/ISCA.2006.7].
9Becchi M, Crowley P. An improved algorithm to accelerate regular expression evaluation. In: Yavatkar R, Grunwald D, Ramakrishnan KK, eds. Proc. of the 2007 ACM/IEEE Symp. on Architecture for Networking and Communications Systems. New York: Association for Computing Machinery, 2007. 145-154. [doi: 10.1145/1323548.1323573].
10Kumar S, Dharmapurikar S, Yu F, Crowley P, Turner J. Algorithms to accelerate multiple regular expressions matching for deep packet inspection. In: Rizzo L, Anderson T, McKeown N, eds. Proc. of the 2006 Conf. on Applications, Technologies, Architectures, and Protocols for Computer Communications. New York: Association for Computing Machinery, 2006. 339-350. [doi: 10.1145/1159913.1159952].

共引文献13

1李鲲鹏,兰巨龙,李印海.基于Bloom filter的高效正则表达式匹配算法[J].计算机应用研究,2012,29(3):950-954. 被引量：4
2杜文超,陈庶樵,胡宇翔.基于DoLFA的高效正则表达式匹配算法[J].计算机科学,2012,39(9):14-19.
3魏德志,洪联系,林丽娜,吴旭.一种改进的XFA在深度包检测中的应用[J].计算机工程与应用,2012,48(34):245-248. 被引量：1
4赵博,郭虹,刘勤让,邬江兴.基于加权累积和检验的加密流量盲识别算法[J].软件学报,2013,24(6):1334-1345. 被引量：40
5林丽娜,洪联系.一种基于多线程的混合深度包检测方法[J].集美大学学报（自然科学版）,2014,19(6):472-476.
6刘志强,张治中.LTE网络中业务识别方案的研究与实现[J].电视技术,2015,39(1):99-102. 被引量：5
7段惠超,韩建民,邱晟.一种存储优化的多模式匹配算法[J].微型机与应用,2015,34(2):14-17.
8朱贺军,祝烈煌.Twitter加密网络行为自动识别方法[J].计算机工程,2015,41(12):166-170.
9张伟,许海洋.一种面向深度包检测的DFA压缩算法[J].计算机应用研究,2017,34(5):1525-1530.
10魏凌波,冯晓兵,张驰,盛化龙,俞能海.基于前缀保持加密的网络功能外包系统[J].通信学报,2018,39(4):159-166. 被引量：1

同被引文献19

1范慧萍,宣蕾,陈曙晖,黄高平.基于正则表达式的应用层协议识别加速[J].计算机研究与发展,2008,45(z1):438-443. 被引量：9
2L7-filter[ EB/OL]. 2014-03-20. http ://17-filter. clearfoun- dation, corn/.
3Snort user manual,the snort project[ EB/OL]. 2014-03-20. https://www, snort, org/documents/snort - users - manual/ snort_manual, pdf.
4Bro manual[ EB/OL]. 2014-05-01. https ://www. bro. org/ sphinx/index, html.
5Yu Fang, Chen Zhifeng,Diao Yanlei. Fast and memory-effi- cient regular expression matching for deep packet inspection [ R ]. Berkeley : University of California, Berkeley ,2006.
6JeffreyEF.精通正则表达式[M].余晟,译.第3版.北京:电子工业出版社,2012.
7Hopcrof J, Ullman F. Introduction to automata theory, langua- ges,and computation [ M ]. 3rd ed. Boston: Addison Wesley, 2007.
8Gong Y, Liu Q, Shao X, et al. A novel regular expression matc- hing algorithm based on multi- dimensional finite automata [ C]//Proc of 15th international conference on high perform- ance switching and routing. [ s. 1. ] :IEEE ,2014:90-97.
9Liu Yanbing, Guo Li, Guo Muyi, et al. Accelerating DFA con- struction by hierarchical merging[ C]//Proc of ninth IEEE in- ternational symposium on parallel and distributed processing with applications. [ s. h ] :IEEE,2011.
10Yu Xiaodong. Deep packet inspection on large datasets algo- rithmic and parallelization techniques for accelerating regular expression matching on manycore[ D]. Missouri : University of Missouri ,2013.

引证文献2

1李朋飞,陈曙晖,徐成成.一种正则表达式编译器优化技术[J].计算机技术与发展,2015,25(9):123-128.
2高阳阳,徐烈伟,俞剑,许薇.一种新型动态可重构的正则表达式匹配引擎设计[J].复旦学报（自然科学版）,2019,58(6):706-718. 被引量：5

二级引证文献5

1张军,王芬芬.基于正则表达式的日志解析系统构建研究[J].无线互联科技,2020,17(3):48-49.
2周敦凯.基于可信计算的嵌入式系统安全设计[J].自动化与仪器仪表,2020(10):81-84. 被引量：4
3栾岚,胡丹.基于信号解释Petri网的可重构调度模型分析研究[J].计算机仿真,2021,38(12):411-414. 被引量：1
4李俊儒,张伟,陈昆明,徐涛.基于预过滤结构的正则表达式硬件专用匹配引擎[J].北京信息科技大学学报（自然科学版）,2022,37(1):50-57. 被引量：1
5刘颖辉,迟学斌,姜金荣,张峰.一种Fortran到CUDA C的转换方法[J].计算机系统应用,2022,31(5):351-357.

1郑明辉.多用途数/模双显式智能仪表的系统设计[J].自动化与仪表,2001,16(5):17-19.
2纪淑波,刘晶,任凤飞.视频监控数据采集器[J].电子世界,2007(6):25-27.
3高磊,屈媛,张建国,张皓,张琛,赵文强.基于Arduino的智能家居环境监测系统设计[J].数字技术与应用,2015,33(6):154-154. 被引量：7
4戴华,李乔良.一种有效的多模式并行匹配算法[J].电脑知识与技术,2007(3):1373-1375. 被引量：1
5朱光,张军亮.基于提升小波变换和SVD的音频水印算法[J].计算机应用与软件,2014,31(2):149-152. 被引量：4
6邹志文,李桥,王政辉,费洪哲.基于订阅划分的多维索引并行匹配算法[J].华中科技大学学报（自然科学版）,2013,41(S2):184-187. 被引量：3
7汪日伟,温显斌,张桦,刘凤连,邹永廖,郑永春.基于统计模型的三线阵CCD影像并行匹配算法[J].光电子．激光,2012,23(6):1196-1200. 被引量：3
8徐光祐,李经纬,杨士强,石定机.面向流水线结构的并行匹配算法[J].计算机学报,1993,16(2):113-121. 被引量：4
9刘增力,李雯霏,朵琳.一种人体互动无线系统的设计[J].昆明理工大学学报（理工版）,2008,33(5):29-32.
10杨敬安,张奠成.基于特征的立体视觉并行匹配算法的研究[J].电子学报,1991,19(1):32-36. 被引量：9

小型微型计算机系统

2013年第5期

浏览历史

内容加载中请稍等...

基于DFA结构的高速并行正则表达式匹配算法被引量：2

参考文献12

二级参考文献16

共引文献13

同被引文献19

引证文献2

二级引证文献5

相关作者

相关机构

相关主题

浏览历史

基于DFA结构的高速并行正则表达式匹配算法 被引量：2

参考文献12

二级参考文献16

共引文献13

同被引文献19

引证文献2

二级引证文献5

相关作者

相关机构

相关主题

浏览历史

基于DFA结构的高速并行正则表达式匹配算法被引量：2