
Sentiment Analysis of Chinese Film Reviews Based on RoBERTa and Syntactic Information
Cited by: 3
Abstract: Fine-grained sentiment analysis is one of the key tasks in natural language processing. Mainstream approaches to sentiment analysis of Chinese film reviews generally use pre-trained models such as Word2Vec to generate static word vectors, which cannot handle polysemy well. In addition, extracting text features with CNN pooling may lose text information and lead to insufficient learning, and these approaches fail to exploit the long-distance dependency information in the text and the syntactic information in the sentence. To address these problems, a new sentiment analysis model, RoBERTa-PWCN-GTRU, is proposed. The model uses the RoBERTa pre-trained model to generate dynamic word vectors, which addresses polysemy. To make full use of the text information, an improved network, DenseDPCNN, captures long-distance dependency information, and its output is fused in a dual-channel manner with the global semantic information obtained by Bi-LSTM. The sentence-level syntactic information obtained by a proximity-weighted convolutional network (PWCN) is then integrated, and a gated Tanh-Relu unit (GTRU) is introduced for further feature selection. Experimental results on a self-constructed Chinese film review dataset show that the proposed model clearly outperforms mainstream models, reaching an accuracy of 89.67% and an F1 score of 82.51%. Ablation experiments further verify the effectiveness of the model. The model can provide useful information for producers' future film production and for consumers' ticket-purchasing decisions, and therefore has practical value.
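The abstract names the gated Tanh-Relu unit (GTRU) but does not give its formula. As a rough illustration only, the sketch below assumes the commonly described form of such a gate, in which a tanh branch carries the feature value and a ReLU branch acts as the gate, combined by elementwise product. The function name `gtru` and the input values are hypothetical and are not taken from the paper.

```python
import math


def gtru(features, gate_inputs):
    """Elementwise gated Tanh-ReLU: tanh(s_i) * relu(a_i).

    Assumed form, not the paper's confirmed definition.
    `features` and `gate_inputs` are pre-activation values of the
    two branches and must have the same length.
    """
    if len(features) != len(gate_inputs):
        raise ValueError("both branches must have the same length")
    return [math.tanh(s) * max(0.0, a) for s, a in zip(features, gate_inputs)]


# Hypothetical pre-activation values for three feature positions.
s = [0.5, -1.0, 2.0]
a = [1.0, 0.3, -0.7]
out = gtru(s, a)
```

In this form, a position whose gate input is negative (the third one here) is suppressed to zero regardless of how strong its tanh feature is, which is the sense in which the unit performs feature selection.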
Authors: CHEN Yu-jia (陈钰佳); ZHENG Geng-sheng (郑更生); XIAO Wei (肖伟) (School of Computer Science and Engineering, Wuhan Institute of Technology, Wuhan 430205, China; Key Laboratory of Intelligent Robot in Hubei Province, Wuhan Institute of Technology, Wuhan 430205, China)
Source: Science Technology and Engineering (《科学技术与工程》), Peking University Core Journal, 2023, No. 18, pp. 7844-7851 (8 pages)
Funding: National Natural Science Foundation of China, Young Scientists Fund (62106179).
Keywords: Chinese film reviews; sentiment analysis; RoBERTa pre-training model; proximity-weighted convolution; gated Tanh-Relu unit
