摘要
介绍了一种基于高频字串提取的快速字串交叉模式匹配算法 ,同已有的 KMP、BM等单模式匹配算法和有限自动机等多模式匹配算法相比 ,在字符集Σ较大且字串个数远大于字串最大长度的情况下 ,该算法具有较低的时间复杂度和空间复杂度 ,并适用于字符集较大 。
This paper presented a fast string pattern matching algorithm based on extracting high frequency strings. Compared with the existing algorithms including single pattern algorithms, such as KMP and BM algorithm, and multi pattern matching algorithms, such as DFA, this algorithm has both lower time and space complexity when the size of Σ is large and the number of strings is far more than the max length of string in set U . As Chinese character set is large and the average length of Chinese words is much short, this algorithm is more suitable to process Chinese text.
出处
《上海交通大学学报》
EI
CAS
CSCD
北大核心
2003年第3期420-423,427,共5页
Journal of Shanghai Jiaotong University
基金
国家自然科学基金资助项目 ( 60 0 82 0 0 3 )
关键词
模式匹配
高频字串
算法
pattern matching
high frequency string
algorithm