Abstract
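To improve the efficiency of searching for multiple repeated patterns in massive amounts of information, the algorithm Epattern_searcher is proposed. It is designed around the idea of filtering algorithms and is implemented with a suffix array, which reduces space usage and thereby improves running speed. The algorithm was tested experimentally on the problem of finding high-frequency words in English novels, and the experiments gave a time complexity of O(dg+g·n2·|σ|-q).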
This paper introduces Epattern-searcher, an algorithm for searching for multiple repeated patterns in large quantities of information. It is based on the design of filtering algorithms and uses a suffix array as its index, which saves storage space and improves the speed of the algorithm. The algorithm was tested by searching for high-frequency words in an English novel; the resulting time complexity is O(d+g/g·n2·|σ|-q).
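The general idea of combining filtration with a suffix array for edit-distance matching can be illustrated with a small sketch. This is not the paper's Epattern_searcher, whose details the abstract does not give; it is the textbook pigeonhole filter, under the assumption that a pattern is split into k+1 pieces, each piece is located exactly through a suffix array, and each candidate region is verified with a dynamic-programming edit-distance check. All function names (`build_suffix_array`, `find_exact`, `approx_ends`, `filter_search`) are hypothetical.

```python
def build_suffix_array(text):
    """Naive suffix array: start positions of all suffixes in sorted order.
    Simple rather than fast; adequate for a sketch."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_exact(text, sa, piece):
    """Return all start positions of `piece` in `text` via binary search on `sa`."""
    n = len(piece)
    lo, hi = 0, len(sa)
    while lo < hi:  # first suffix whose n-char prefix is >= piece
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + n] < piece:
            lo = mid + 1
        else:
            hi = mid
    start, hi = lo, len(sa)
    while lo < hi:  # first suffix whose n-char prefix is > piece
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + n] <= piece:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[start:lo])


def approx_ends(pattern, window, k):
    """End positions (exclusive) in `window` of substrings whose edit
    distance to `pattern` is at most k (semi-global dynamic programming)."""
    m = len(pattern)
    col = list(range(m + 1))  # distances against the empty text prefix
    ends = []
    for j, c in enumerate(window, 1):
        prev_diag, col[0] = col[0], 0  # a match may start anywhere in the text
        for i in range(1, m + 1):
            old = col[i]
            col[i] = min(prev_diag + (pattern[i - 1] != c),  # match / substitute
                         old + 1,                            # extra text character
                         col[i - 1] + 1)                     # missing text character
            prev_diag = old
        if col[m] <= k:
            ends.append(j)
    return ends


def filter_search(text, sa, pattern, k):
    """Filter, then verify: any occurrence with at most k errors must contain
    at least one of k+1 disjoint pattern pieces unchanged."""
    m = len(pattern)
    size = m // (k + 1)
    if size == 0:
        raise ValueError("pattern too short for k errors")
    hits = set()
    for j in range(k + 1):
        off = j * size
        piece = pattern[off:] if j == k else pattern[off:off + size]
        for p in find_exact(text, sa, piece):
            lo = max(0, p - off - k)               # candidate window around the hit
            hi = min(len(text), p - off + m + k)
            for e in approx_ends(pattern, text[lo:hi], k):
                hits.add(lo + e)
    return sorted(hits)


text = "the cat sat on the mat"
sa = build_suffix_array(text)
print(filter_search(text, sa, "cat", 1))
```

For several patterns, the same suffix array can be reused and `filter_search` called once per pattern, so the index is built only once.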
Source
《甘肃联合大学学报(自然科学版)》
2010, No. 6, pp. 50-55 (6 pages)
Journal of Gansu Lianhe University: Natural Sciences
Keywords
edit distance
filtration
pattern matching
suffix arrays