Abstract
Traditional machine learning methods struggle to capture the semantic information of text. To address this shortcoming, a recursive autoencoder (RAE) is used to learn vector space representations of phrases through its tree structure. This approach achieves good results on commonly used datasets, but learning the vector representations usually requires a large amount of labeled data to label every node, which makes manual annotation costly. This paper therefore proposes a semi-supervised method that uses pointwise mutual information (PMI) to compute the sentiment polarity of terminal nodes, and that accounts for the influence of degree adverbs and negation words in the context on the sentiment orientation and sentiment strength of the sentiment words they modify. Experimental results show that, compared with the traditional RAE model with manually labeled nodes, labeling nodes with the PMI method raises accuracy to 88.1% and reduces the manual annotation workload to some extent.
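The abstract does not give the scoring formula, so the following is only a minimal sketch of one common way to realize PMI-based polarity labeling: a seed-word SO-PMI score computed from document-level co-occurrence, then adjusted by preceding negation words and degree adverbs. The function names, the smoothing constant `eps`, the context `window`, the seed lists, and the degree weights are all hypothetical and are not taken from the paper.

```python
import math

def pmi(word, seed, docs, eps=0.5):
    """Document-level PMI of `word` and `seed`; `eps` is an assumed smoothing term."""
    n = len(docs)
    n_w = sum(1 for d in docs if word in d)
    n_s = sum(1 for d in docs if seed in d)
    n_ws = sum(1 for d in docs if word in d and seed in d)
    # PMI = log2( p(w, s) / (p(w) * p(s)) ), smoothed to avoid log(0).
    return math.log2(((n_ws + eps) * n) / ((n_w + eps) * (n_s + eps)))

def so_pmi(word, docs, pos_seeds, neg_seeds):
    """Polarity score: association with positive seeds minus negative seeds."""
    pos = sum(pmi(word, s, docs) for s in pos_seeds)
    neg = sum(pmi(word, s, docs) for s in neg_seeds)
    return pos - neg

def adjusted_polarity(tokens, index, base_score, negations, degree_weights, window=2):
    """Flip the sign for each preceding negation and scale by each degree adverb."""
    sign, weight = 1.0, 1.0
    for tok in tokens[max(0, index - window):index]:
        if tok in negations:
            sign = -sign
        if tok in degree_weights:
            weight *= degree_weights[tok]
    return sign * weight * base_score

# Toy corpus: each document is a set of tokens (hypothetical data).
docs = [{"great", "good"}, {"great", "good", "film"},
        {"awful", "bad"}, {"awful", "bad", "film"}]
great_score = so_pmi("great", docs, ["good"], ["bad"])
awful_score = so_pmi("awful", docs, ["good"], ["bad"])
negated = adjusted_polarity(["not", "very", "good"], 2, 1.0, {"not"}, {"very": 1.5})
```

In this toy setup, a word that co-occurs mainly with the positive seed receives a positive score and one that co-occurs mainly with the negative seed receives a negative score; such scores could then serve as automatic labels for terminal nodes in place of manual annotation.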
Authors
SUN Qi (孙琦)
LIANG Yong-quan (梁永全)
College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China
Source
Software Guide (《软件导刊》)
2021, No. 6, pp. 59-62 (4 pages)