摘要
中文字符的相互独立性导致AC算法的时空性能急剧下降。针对此问题,对AC算法的存储结构进行了改进,提出了一种适合中文的多模式匹配算法———AC_SC算法。该算法以邻接链表存储有限状态自动机,尝试解决存储空间快速膨胀问题,并将状态"0"的长链表转化为散列链表,以提高算法的匹配效率。实验结果表明,AC_SC算法具有良好的时空性能。
Because of the independent of the Chinese characters, the space and time performances of AC algorithm de- cline sharply. For this problem,the storage structure of AC algorithm was improved and a multi-pattern algorithm for Chinese named AC_SC was proposed. The algorithm uses adjacency-list to store the finite state automata to solve the problem of the storage space rapid expansion. Besides, the long linked list of the state '0' is changed into a Hash linked table to improve the matching efficient. Experimental results show that AC_SC has better time and space performances.
出处
《计算机科学》
CSCD
北大核心
2013年第11期117-121,共5页
Computer Science
基金
安徽省自然科学基金(090412051)
广东省教育部产学研结合项目(2008B0905002400)资助
关键词
多模式匹配
AC算法
邻接链表
有限状态自动机
Multi-pattern matching, AC algorithm, Adjacency-list, Finite state automata