
The Research of Emotional New Word Discovery Based on Word Vector
Abstract: First, a word segmentation tool is used to preprocess a large collection of corpus documents, performing word segmentation and part-of-speech tagging. Second, a script extracts words from the tagged corpus according to the parts of speech typical of emotional words, building a candidate emotional lexicon; intersecting this candidate lexicon with an external emotional lexicon yields a benchmark emotional word list. Third, the Word2Vec tool is used to train word vectors on the candidate emotional lexicon, and the distance value between emotional words is computed with reference to the benchmark word list. Finally, the distance values are compared to identify emotional words: the larger the value, the higher the semantic similarity between two words, so new emotional words are selected according to their distance from the benchmark words.
Author: 胡创业 (HU Chuangye), Xinjiang Normal University, Urumqi, Xinjiang 830054, China
Affiliation: Xinjiang Normal University
Source: 《信息与电脑》 (Information & Computer), 2021, No. 17, pp. 50-52 (3 pages)
Funding: Research on Methods for Automatically Constructing a Chinese-Uzbek Parallel Corpus (Project No. XJNUSYS2019B10).
Keywords: emotional neologism; word segmentation; Word2Vec; word vector
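
The selection step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the toy vectors, the word lists, the function names, and the `threshold` parameter are all hypothetical, and the "distance value" is assumed here to be cosine similarity (larger means more similar, as the abstract states). In practice the vectors would come from a Word2Vec model trained on the segmented corpus.

```python
import math

# Hypothetical toy vectors standing in for trained Word2Vec embeddings.
vectors = {
    "happy":   [0.90, 0.10, 0.00],
    "joyful":  [0.80, 0.20, 0.10],
    "awesome": [0.85, 0.15, 0.05],
    "table":   [0.00, 0.10, 0.90],
}

# Hypothetical candidate lexicon (from POS-based extraction) and external lexicon.
candidate_words = {"happy", "joyful", "awesome", "table"}
external_lexicon = {"happy", "joyful", "sad"}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def select_new_words(vectors, candidates, lexicon, threshold):
    """Rank non-benchmark candidates by their best similarity to a benchmark word."""
    # Benchmark list: candidates that also appear in the external lexicon.
    benchmark = candidates & lexicon
    scored = []
    for word in candidates - benchmark:
        if word not in vectors:
            continue
        best = max(
            (cosine(vectors[word], vectors[b]) for b in benchmark if b in vectors),
            default=0.0,
        )
        # Assumed cutoff: keep only candidates close to some benchmark word.
        if best >= threshold:
            scored.append((word, best))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


new_words = select_new_words(vectors, candidate_words, external_lexicon, threshold=0.8)
print(new_words)
```

With these toy values, only "awesome" lies close enough to the benchmark words to be kept, while "table" is rejected. The abstract does not state how the cutoff is chosen, so the threshold of 0.8 is purely illustrative.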
