
Stance Detection in Chinese Microblog Topic Based on Word Embedding Technology and Thematic Words Feature (cited by: 7)
Abstract: With the growth of the mobile Internet, microblog topics have become increasingly popular, and a single trending topic may attract tens of thousands of comments. Stance detection for microblog topics aims to determine whether the author of a post is in favor of, against, or neutral toward a given topic. In this paper, Word2Vec is used to train a word vector for every word in the corpus to capture the semantic information of sentences; TextRank is used to build a thematic word set that serves as the stance feature of a topic; and a sentiment lexicon is used to capture the sentiment information of sentences. The word vectors retained after feature selection are then used to train a Support Vector Machine (SVM), which makes the final stance predictions. Experiments show that stance features combining thematic words and sentiment words yield good stance detection performance.
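The feature-building steps named in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the TextRank window size, the damping factor, or the exact rule for combining thematic and sentiment words, so those choices are assumptions here. The helper names `textrank_keywords` and `sentence_vector`, the toy English tokens, and the hand-written two-dimensional vectors are all hypothetical. In practice the tokens would come from a Chinese word segmenter and the vectors from a trained Word2Vec model.

```python
from collections import defaultdict


def textrank_keywords(sentences, top_k=5, window=2, damping=0.85, iterations=50):
    """Rank words with TextRank on an undirected co-occurrence graph.

    sentences: list of token lists (already segmented, stop words removed).
    window, damping and iterations are assumed defaults, not values from the paper.
    """
    # Link each word to the words that follow it within `window` positions.
    neighbors = defaultdict(set)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            for other in tokens[i + 1 : i + 1 + window]:
                if other != word:
                    neighbors[word].add(other)
                    neighbors[other].add(word)

    # PageRank-style power iteration over the unweighted graph.
    scores = {word: 1.0 for word in neighbors}
    for _ in range(iterations):
        scores = {
            word: (1 - damping)
            + damping * sum(scores[n] / len(neighbors[n]) for n in neighbors[word])
            for word in neighbors
        }

    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_k]


def sentence_vector(tokens, vectors, feature_words):
    """Average the word vectors of tokens that belong to feature_words.

    Assumption: "feature selection" is modeled as keeping only thematic
    and sentiment words; the paper may use a different rule.
    """
    kept = [vectors[t] for t in tokens if t in feature_words and t in vectors]
    if not kept:
        dim = len(next(iter(vectors.values())))
        return [0.0] * dim
    return [sum(column) / len(kept) for column in zip(*kept)]


# Toy corpus for one hypothetical topic (placeholder tokens).
corpus = [
    ["ban", "fireworks", "festival", "pollution"],
    ["fireworks", "tradition", "festival"],
    ["support", "ban", "fireworks", "pollution"],
]
thematic_words = set(textrank_keywords(corpus, top_k=3))
sentiment_words = {"support"}  # stand-in for a sentiment lexicon

# Stand-in for Word2Vec output: hand-written 2-dimensional vectors.
toy_vectors = {
    "ban": [1.0, 0.0],
    "fireworks": [0.0, 1.0],
    "support": [1.0, 1.0],
}
features = sentence_vector(corpus[2], toy_vectors, thematic_words | sentiment_words)
```

Each resulting sentence vector, paired with its stance label (favor, against, or neither), would then be passed to an SVM classifier for training and prediction.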
Authors: ZHENG Hai-Yang; GAO Jun-Bo; QIU Jie; JIAO Feng (College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China)
Source: Computer Systems & Applications (《计算机系统应用》), 2018, No. 9, pp. 118-123 (6 pages)
Keywords: thematic word set; stance detection; thematic word feature; word embedding; stance feature
