A Matching Algorithm for Network Streaming Text Data
Abstract: This paper describes a search algorithm for real-time monitoring of network streaming data. Applying the principle of finite automata, the algorithm constructs a finite-state pattern matching machine from a finite set of keywords and then uses that machine to scan a data stream of arbitrary length in a single pass, without backtracking, locating all occurrences of any of the keywords. A probability calculation is added so that, to some extent, the algorithm can perform simple fuzzy semantic analysis of the text. The algorithm has been used in Internet filtering software, where it has improved speed and performed well.
Source: Journal of Qiqihar University (Natural Science Edition), 2005, No. 2, pp. 37-41 (5 pages)
Keywords: automaton; state set; information retrieval; keyword; string pattern matching
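
The abstract does not give the construction itself, so the following is only a minimal sketch of the general technique it names: a finite-state machine built from several keywords (in the Aho-Corasick style) that scans a stream chunk by chunk in one pass without backtracking. The class name StreamMatcher, the method feed, and the reported offsets are hypothetical choices made for illustration, not the paper's own design, and the probability calculation mentioned in the abstract is not modeled here because the abstract does not specify it.

```python
from collections import deque


class StreamMatcher:
    """Hypothetical Aho-Corasick style automaton fed one chunk at a time."""

    def __init__(self, keywords):
        self.goto = [{}]   # goto[state][char] -> next state (state 0 is the root)
        self.fail = [0]    # failure link of each state
        self.out = [[]]    # keywords recognised on entering each state
        for word in keywords:
            self._add(word)
        self._build_failure_links()
        self.state = 0     # current state, preserved between chunks
        self.pos = 0       # characters consumed from the stream so far

    def _add(self, word):
        # Insert one keyword into the trie of goto transitions.
        state = 0
        for ch in word:
            if ch not in self.goto[state]:
                self.goto.append({})
                self.fail.append(0)
                self.out.append([])
                self.goto[state][ch] = len(self.goto) - 1
            state = self.goto[state][ch]
        self.out[state].append(word)

    def _build_failure_links(self):
        # Breadth-first traversal; depth-1 states keep the root as failure link.
        queue = deque(self.goto[0].values())
        while queue:
            state = queue.popleft()
            for ch, nxt in self.goto[state].items():
                queue.append(nxt)
                f = self.fail[state]
                while f and ch not in self.goto[f]:
                    f = self.fail[f]
                self.fail[nxt] = self.goto[f].get(ch, 0)
                # Inherit keywords that end at the failure state as well.
                self.out[nxt] = self.out[nxt] + self.out[self.fail[nxt]]

    def feed(self, chunk):
        """Consume one chunk; return (end_offset, keyword) pairs found in it."""
        hits = []
        for ch in chunk:
            # Follow failure links instead of re-reading earlier input.
            while self.state and ch not in self.goto[self.state]:
                self.state = self.fail[self.state]
            self.state = self.goto[self.state].get(ch, 0)
            self.pos += 1
            for word in self.out[self.state]:
                hits.append((self.pos, word))
        return hits
```

Because the current state is kept between calls, a keyword split across two network packets is still found. For example, with the keywords he, she, his and hers, feeding the chunks "us", "he" and "rs" in turn reports she and he ending at offset 4 and hers ending at offset 6.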
