Abstract
In recent years, social networking platforms such as Sina Weibo and Twitter have become one of the main carriers of public opinion, giving netizens a convenient way to express their views and emotions. Public opinion monitoring based on social network big data has become a new research hotspot. Monitoring public sentiment with social network big data from different countries helps to grasp directly the public's emotional tendencies in international relations, and is of considerable value to China's diplomacy, foreign trade, and related areas. Based on this, a public sentiment monitoring system for Chinese and Japanese corpora is proposed. The system analyzes the sentiment tendencies in Chinese and Japanese text from social platforms such as Sina Weibo and Twitter at the same time, and presents them to users in visual form. For the sentiment analysis algorithm, a new sentiment analysis model, EmoBERT, is proposed, which builds on the BERT model and incorporates self-expanded Chinese and Japanese sentiment lexicons. Experimental results show that EmoBERT outperforms the original BERT model on both the Chinese and the Japanese sentiment classification tasks: the Chinese model EmoBERT-C raises the accuracy of Chinese BERT from 89.68% to 92.15%, and the Japanese model EmoBERT-J raises the accuracy of Japanese BERT from 74.73% to 78.26%.
Authors
LI Aili (李爱黎)
ZHANG Zishuai (张子帅)
LIN Yin (林荫)
WANG Qiuju (王秋菊)
YANG Jianan (杨建安)
MENG Weicheng (孟炜程)
ZHANG Yanfeng (张岩峰)
(School of Computer Science and Engineering, Northeastern University, Shenyang 110000, China; Foreign Studies College, Northeastern University, Shenyang 110000, China)
Source
Big Data Research (《大数据》)
2022, No. 6, pp. 105-126 (22 pages)
Funding
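National Natural Science Foundation of China (No. 62072082)
Key Research and Development Program of Liaoning Province (No. 2020JH2/10100037)
Fundamental Research Funds for the Central Universities (No. N2216015)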
Keywords
sentiment analysis
public opinion monitoring
sentiment lexicon
Chinese-Japanese relation
Weibo
Twitter