
Domain Sentiment Lexicon Construction Based on Word Vectors (基于词向量的领域情感词典构建) Cited by: 14

Building of domain sentiment lexicon based on word2vec
Abstract: To address the shortcomings of existing domain sentiment lexicons in sentiment and semantic representation, a method for constructing a domain sentiment lexicon based on word vectors is proposed. A word2vec model was trained on 250,000 news articles and more than 100,000 hotel reviews. Eighty sentiment words with clear sentiment, rich content, and diverse parts of speech were chosen as the seed word set. Using TF-IDF as a measure of word importance, 9,860 domain candidate sentiment words were obtained from the hotel reviews. The semantic similarity between the word vectors of the candidate sentiment words and those of the seed words was then calculated, mapping each sentiment word into a high-dimensional vector space and yielding a feature vector representation of sentiment words (Senti2vec). Senti2vec was applied to sentiment word polarity classification and to text sentiment analysis. The experimental results showed that Senti2vec can represent both the meaning and the sentiment of sentiment words. Because the semantic similarity is computed from a domain-specific corpus, the extracted sentiment features are more domain-specific, and the method is not constrained by the scope of the candidate sentiment word set.
Authors: LIN Jianghao (林江豪); ZHOU Yongmei (周咏梅); YANG Aimin (阳爱民); CHEN Jin (陈锦) (Laboratory for Language Engineering and Computing, Guangdong University of Foreign Studies, Guangzhou 510006, Guangdong, China; School of Information Science and Technology, Guangdong University of Foreign Studies, Guangzhou 510006, Guangdong, China; International College, Guangdong University of Foreign Studies, Guangzhou 510420, Guangdong, China)
Source: Journal of Shandong University (Engineering Science) (《山东大学学报(工学版)》); indexed in CAS and the Peking University Core Journals list; 2018, No. 3, pp. 40-47 (8 pages)
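Funding: Ministry of Education Humanities and Social Sciences Project (14YJA740011); Science and Technology Innovation Project of the Department of Education of Guangdong Province (2013KJCX0067); Guangdong Province Philosophy and Social Sciences "Twelfth Five-Year" Planning Project (GD15YTS01); Guangdong Province Science and Technology Plan Project (2017A040406025); Guangdong University of Foreign Studies Teaching Reform Project (GWJY2017046)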
Keywords: domain sentiment lexicon; word2vec; sentiment word; sentiment feature vector; semantic similarity
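The similarity-mapping step described in the abstract can be illustrated with a minimal sketch. It assumes cosine similarity as the semantic similarity measure and represents each word by its vector of similarities to the seed words; the abstract does not give the exact formula, so this is one plausible reading rather than the authors' implementation. The function names `cosine_similarity` and `senti2vec`, the toy 3-dimensional embeddings, and the two-word seed set are all hypothetical stand-ins for a trained word2vec model and the 80 seed words.

```python
import numpy as np


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(np.dot(u, v) / denom) if denom > 0 else 0.0


def senti2vec(word, seed_words, embeddings):
    """Map `word` to the vector of its similarities to each seed word.

    Assumption: one feature per seed word, in seed-list order.
    """
    w = embeddings[word]
    return np.array([cosine_similarity(w, embeddings[s]) for s in seed_words])


# Made-up toy embeddings standing in for word2vec output.
embeddings = {
    "good": np.array([1.0, 0.0, 0.0]),
    "bad": np.array([-1.0, 0.0, 0.0]),
    "clean": np.array([0.8, 0.6, 0.0]),
    "dirty": np.array([-0.6, 0.8, 0.0]),
}
seed_words = ["good", "bad"]  # hypothetical seeds; the paper uses 80

clean_vec = senti2vec("clean", seed_words, embeddings)
dirty_vec = senti2vec("dirty", seed_words, embeddings)
```

With these toy values, a candidate word that lies closer to the positive seed receives a larger first component, so the resulting vector could feed a downstream polarity classifier. The feature dimension equals the number of seed words, not the embedding dimension.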
