Algorithm of Network Hot Word Detection Based on News Title
Abstract This paper uses a PAT-Tree-based candidate phrase extraction algorithm, modifying the PAT-Tree data structure so that it can handle variable-length Chinese strings as well as non-Chinese characters. Mutual information is used to assess how strongly the parts of a string are associated, and a forward filtering algorithm, designed around the characteristics of news reports and network hot words, is then used to discover network hot words. Compared with similar algorithms, this algorithm does not require complex linguistic rules or a scoring formula for candidate phrases, so it is simpler to implement and faster. Experiments demonstrate the effectiveness and correctness of the algorithm.
Author Guo Chong (郭冲)
Source Computer and Modernization (《计算机与现代化》), 2013, No. 3, pp. 58-62, 66 (6 pages)
Keywords network hot word; PAT-Tree; mutual information; Chinese string; candidate phrase
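
The mutual-information step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact formula, so the following assumes pointwise mutual information over substring counts and takes the weakest split point as the string's score. The function names `substring_counts` and `association` and the sample titles are hypothetical, and a plain `Counter` stands in for the modified PAT-Tree.

```python
import math
from collections import Counter


def substring_counts(titles, max_len=4):
    """Count every substring of up to max_len characters in the titles.

    Assumption: a Counter replaces the paper's modified PAT-Tree here.
    """
    counts = Counter()
    for title in titles:
        for start in range(len(title)):
            for end in range(start + 1, min(start + max_len, len(title)) + 1):
                counts[title[start:end]] += 1
    return counts


def association(candidate, counts, total_chars):
    """Assumed score: lowest pointwise mutual information over all split points."""
    if len(candidate) < 2:
        raise ValueError("candidate must contain at least two characters")
    if counts[candidate] == 0:
        return float("-inf")  # never observed in the corpus
    p_whole = counts[candidate] / total_chars
    scores = []
    for split in range(1, len(candidate)):
        p_left = counts[candidate[:split]] / total_chars
        p_right = counts[candidate[split:]] / total_chars
        scores.append(math.log(p_whole / (p_left * p_right)))
    return min(scores)


titles = ["abcd", "abef", "gabh", "cdef"]  # toy stand-ins for news titles
counts = substring_counts(titles)
total_chars = sum(len(t) for t in titles)
strong = association("ab", counts, total_chars)
weak = association("bc", counts, total_chars)
```

On this toy data `ab` scores higher than `bc`, because `a` and `b` always occur together while `b` and `c` are adjacent only once. A filtering step would then keep candidates whose score exceeds some threshold, which the abstract does not specify.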