An Efficient Approach for Counting Multiple Keywords Frequency (cited by: 4)
Abstract: Based on the characteristics of word frequency counting, this paper designs a method for counting the frequencies of multiple keywords. To make full use of the redundant information shared among keywords, the method stores the keyword set in a tree-shaped data structure. This allows multiple keywords to be matched efficiently, so a single scan of the document yields the frequencies of all keywords. Theoretical analysis and experiments show that the method performs considerably better than traditional keyword frequency counting methods.
Authors: 马志柔 (Ma Zhirou), 叶屹 (Ye Yi)
Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core list (北大核心); 2006, No. 10, pp. 191-192, 203 (3 pages)
Keywords: pattern matching; multiple keywords; word frequency statistics
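
The abstract's central idea, storing the keyword set in a tree and counting every keyword in one pass over the document, can be illustrated with a small sketch. The abstract does not describe the paper's own data structure in detail, so the code below substitutes the well-known Aho-Corasick automaton (a trie with failure links) as a stand-in. The names `build_automaton` and `count_keywords` are hypothetical and chosen only for this illustration.

```python
from collections import deque


def build_automaton(keywords):
    """Build a trie over the keywords, then add Aho-Corasick failure links."""
    goto = [{}]   # goto[state][char] -> next state; state 0 is the root
    fail = [0]    # fail[state] -> state for the longest proper suffix in the trie
    out = [[]]    # out[state] -> keywords that end when this state is reached

    # Shared prefixes are stored once; this is where the tree saves work.
    for word in keywords:
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(word)

    # Breadth-first pass; depth-1 states keep their failure link at the root.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit keywords that end at the fallback state.
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def count_keywords(text, keywords):
    """Count occurrences (overlaps included) of every keyword in one scan."""
    # Assumption for this sketch: duplicates and empty keywords are ignored.
    keywords = [w for w in dict.fromkeys(keywords) if w]
    goto, fail, out = build_automaton(keywords)
    counts = {word: 0 for word in keywords}
    state = 0
    for ch in text:
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for word in out[state]:
            counts[word] += 1
    return counts
```

For example, `count_keywords("ushers", ["he", "she", "his", "hers"])` returns `{'he': 1, 'she': 1, 'his': 0, 'hers': 1}`. The text is read exactly once regardless of how many keywords are supplied, which is the property the abstract emphasizes. Whether the paper's structure uses failure links or a different fallback scheme cannot be determined from the abstract alone.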