Abstract
Building on the multi-pattern matching algorithm based on deterministic finite state automata (the DFSA algorithm) and incorporating the strengths of the Tuned Boyer-Moore (Tuned BM) algorithm, this paper proposes a fast multi-pattern string matching algorithm that skips consecutively over mismatched characters during multi-pattern matching. A further refinement yields a matching algorithm whose worst-case time complexity is linear. Analysis shows that the number of characters actually compared decreases as the patterns get longer and increases somewhat as the pattern set grows. Experiments show that, for short patterns, the algorithm needs only 1/2 to 1/3 of the matching time of the AC algorithm and about 9/10 of that of the AQR algorithm; for long patterns, it needs 1/4 to 1/8 of the time of AC and about 3/4 of that of AQR.
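The abstract does not spell out the algorithm, so the block below is not the authors' method. As a rough illustration of the general idea (an automaton over the whole pattern set combined with Boyer-Moore-style skipping), here is a minimal sketch of a different, well-known technique, Set Horspool. All function names are hypothetical, and unlike the refined variant described in the abstract, this sketch has no linear worst-case guarantee.

```python
def build_reversed_trie(patterns):
    """Trie over the reversed patterns; the key None marks complete patterns."""
    root = {}
    for p in patterns:
        node = root
        for ch in reversed(p):
            node = node.setdefault(ch, {})
        node.setdefault(None, []).append(p)
    return root


def build_shift_table(patterns):
    """Horspool-style shifts: for each character, its smallest distance to the
    end of any pattern (last position excluded), capped at the shortest length."""
    lmin = min(len(p) for p in patterns)
    shift = {}
    for p in patterns:
        for j, ch in enumerate(p[:-1]):
            dist = len(p) - 1 - j
            shift[ch] = min(shift.get(ch, lmin), dist)
    return lmin, shift


def set_horspool_search(text, patterns):
    """Return (start_index, pattern) for every occurrence of any pattern."""
    if not patterns or any(not p for p in patterns):
        raise ValueError("patterns must be a non-empty list of non-empty strings")
    trie = build_reversed_trie(patterns)
    lmin, shift = build_shift_table(patterns)
    matches = []
    pos = lmin - 1  # text index where the current window ends
    while pos < len(text):
        # Read the text backwards from pos while it follows the reversed trie.
        node, j = trie, pos
        while j >= 0 and text[j] in node:
            node = node[text[j]]
            for p in node.get(None, ()):
                matches.append((j, p))
            j -= 1
        # Skip ahead based on the character under the end of the window.
        pos += shift.get(text[pos], lmin)
    return matches


if __name__ == "__main__":
    for start, pat in sorted(set_horspool_search("ushers", ["he", "she", "his", "hers"])):
        print(start, pat)
```

The shift is capped at the shortest pattern length because a pattern of that length could start immediately after the current window; a larger jump could step over it.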
Source
《哈尔滨工业大学学报》
EI
CAS
CSCD
PKU Core Journals (北大核心)
2007, No. 12, pp. 1925-1929 (5 pages)
Journal of Harbin Institute of Technology
Funding
Supported by the National Natural Science Foundation of China (60203021)