Abstract
Public opinion sentiment analysis studies the public's emotional leaning toward public events. Because public opinion about public health events directly affects social stability, sentiment analysis of Weibo posts is particularly important. This paper uses an epidemic-related text dataset and adopts RoBERTa-BDA (RoBERTa-BiGRU-Double Attention), a model that combines RoBERTa, BiGRU, and two layers of attention, as its overall structure. First, RoBERTa produces word embeddings that carry the contextual information of the text. Second, a BiGRU produces character representations, and an attention mechanism computes the influence of each character on the global context. A second BiGRU then produces sentence representations, and a final attention mechanism computes the weight of each character within its sentence, yielding a representation of the full text, which is classified by sentiment with a softmax function. Three experiments were designed to verify the effectiveness of RoBERTa-BDA. In the word-vector comparison, RoBERTa improved Macro F1 and Micro F1 over BERT by 0.42 and 0.84 percentage points, respectively. In the comparison of feature extraction layers, BiGRU-Double Attention improved on BiGRU-Attention by 3.62 and 1.34 percentage points, respectively. In the cross-platform comparison, the Macro F1 and Micro F1 of RoBERTa-BDA on the Tieba platform were only 1.29 and 2.88 percentage points lower than on the Weibo platform.
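The results above are reported as Macro F1 and Micro F1. As a reminder of how the two differ, the following is a minimal sketch using the standard definitions for single-label multi-class classification; it is not the paper's evaluation code, and the function name `f1_scores` and the toy labels are hypothetical.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (macro_f1, micro_f1) for single-label multi-class predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0, 0.0

    # Per-class true positives, false positives, and false negatives.
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Macro F1: unweighted mean of per-class F1, so rare classes count equally.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    # Micro F1: F1 computed on counts pooled over all classes.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return macro, micro


# Hypothetical toy labels, for illustration only.
y_true = ["pos", "pos", "neg", "neu", "neg", "pos"]
y_pred = ["pos", "neg", "neg", "neu", "pos", "pos"]
macro_f1, micro_f1 = f1_scores(y_true, y_pred)
print(f"Macro F1 = {macro_f1:.4f}, Micro F1 = {micro_f1:.4f}")
```

In the single-label setting, Micro F1 equals accuracy, while Macro F1 is more sensitive to performance on minority sentiment classes, which is one reason the two can move by different amounts in the same comparison.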
Authors
吴加辉
加云岗
王志晓
张九龙
闫文耀
高昂
车少鹏
WU Jia-hui; JIA Yun-gang; WANG Zhi-xiao; ZHANG Jiu-long; YAN Wen-yao; GAO Ang; CHE Shao-peng (School of Computer Science, Xi’an Polytechnic University, Xi’an 710600, China; School of Computer Science and Engineering, Xi’an University of Technology, Xi’an 710048, China; School of Xi’an Innovation, Yan’an University, Xi’an 710100, China; National Satellite Meteorological Centre, Beijing 100080, China; School of Journalism and Communication, Tsinghua University, Beijing 100084, China)
Source
《计算机技术与发展》
2024, Issue 7, pp. 175-183 (9 pages)
Computer Technology and Development
Funding
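Ministry of Education Humanities and Social Sciences Research Youth Fund (16YJCZH109)
2022 Shaanxi Provincial Science and Technology Plan, Regional Innovation Capability Guidance Program (2022QFY01-17)
Research and Application of Key Technologies for Multimodal Scene Perception in Smart Cities (2023JH-RGZNGG-0011)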