
A Mutual-Information-Based New Word Detection Algorithm for Micro-blogs (cited by: 1)

Abstract: Micro-blogging is an internet medium that has emerged in recent years, and it produces new online vocabulary all the time. To address the shortcomings of existing new word detection algorithms, this paper proposes a mutual-information-based algorithm for detecting new words in micro-blogs, which applies the mutual-information approach of merging multi-character words to micro-blog new word detection. Experiments verify that the algorithm is effective for micro-blog new word detection.
Source: Science & Technology Vision (《科技视界》), 2015, No. 15, pp. 137, 145 (2 pages)
Keywords: micro-blog; new word detection; mutual information
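
The abstract does not spell out how mutual information is computed or how multi-character words are merged, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It scores adjacent character pairs with the standard pointwise mutual information, PMI(x, y) = log2(p(xy) / (p(x) p(y))), and keeps high-scoring pairs that are absent from a known-word dictionary. The function name `pmi_candidates`, the parameters `min_count` and `min_pmi`, and their default values are all assumptions made for illustration.

```python
import math
from collections import Counter


def pmi_candidates(texts, known_words, min_count=2, min_pmi=1.0):
    """Return (candidate, pmi) pairs for adjacent character pairs.

    Hypothetical helper: thresholds and the two-character limit are
    illustrative assumptions, not values taken from the paper.
    """
    unigrams = Counter()
    bigrams = Counter()
    for text in texts:
        chars = list(text)
        unigrams.update(chars)
        bigrams.update(zip(chars, chars[1:]))

    total_uni = sum(unigrams.values())
    total_bi = sum(bigrams.values())
    results = []
    for (a, b), count in bigrams.items():
        # Skip pairs that contain whitespace or punctuation.
        if not (a.isalnum() and b.isalnum()):
            continue
        word = a + b
        # Skip rare pairs and words the dictionary already knows.
        if count < min_count or word in known_words:
            continue
        p_ab = count / total_bi
        p_a = unigrams[a] / total_uni
        p_b = unigrams[b] / total_uni
        pmi = math.log2(p_ab / (p_a * p_b))
        if pmi >= min_pmi:
            results.append((word, pmi))
    # Strongest associations first.
    return sorted(results, key=lambda item: item[1], reverse=True)
```

Calling `pmi_candidates(posts, dictionary)` on a list of micro-blog posts returns candidate two-character strings ranked by PMI. Extending a candidate to three or more characters would mean repeating the same scoring with an already merged pair treated as a single unit; how the paper actually does this is not stated in the abstract.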
