Abstract
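Weibo stance detection determines whether the attitude a Weibo post expresses toward a given target topic is supportive, neutral, or opposed. With the growth of social media, mining the stance information contained in massive volumes of Weibo data has become an important research problem. Existing methods, however, often treat the task as sentiment classification and do not analyze the relational features between the target topic and the Weibo text. Building on a deep-learning classification framework, this paper proposes a stance detection model based on Bert-Condition-CNN. First, to improve the coverage of the topic in the text, topic phrases are extracted from the Weibo text to form a topic set. Next, a Bert pre-trained model is used to obtain sentence vectors for the text, and a Condition layer, a relation matrix between the topic set and the sentence vectors of the Weibo text, is constructed to capture the relational features of the two text sequences. Finally, a CNN extracts features from the Condition layer, analyzing how different topics affect the stance information and predicting the stance label. The model achieves good results on the dataset of the Conference on Natural Language Processing and Chinese Computing (NLPCC2016), and the Condition layer extended with topic phrases effectively improves stance detection accuracy.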
Stance detection aims to automatically determine whether a Weibo text is in favor of a given target, against it, or neither. Mining stance information about a given target is an emerging problem. Building on the success of deep learning in classification, this study proposes a Bert-Condition-CNN model to predict the stance label. First, because the given target may not appear in the Weibo text, we extract topic phrases from the Weibo corpus to supplement the given target. Then, we use the Bert language model to obtain text representation vectors and compute a Condition matrix whose entries represent the relationship between the Weibo text and the topic phrases. Finally, a convolutional neural network captures stance features from the Condition matrix. Experimental results on the NLPCC2016 dataset show that the model performs well on stance detection.
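The pipeline described in the abstract (topic-phrase vectors, a Condition matrix relating them to the text, then convolution) can be illustrated with a minimal sketch. The abstract does not say how the entries of the Condition matrix are computed, so cosine similarity is used here purely as an assumption. The toy 3-dimensional vectors stand in for Bert outputs, and the fixed 2x2 kernel stands in for a learned CNN filter; the function names are hypothetical and not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def condition_matrix(topic_vecs, text_vecs):
    """Rows index topic phrases, columns index text segments.

    ASSUMPTION: entries are cosine similarities; the paper may define them differently.
    """
    return [[cosine(t, s) for s in text_vecs] for t in topic_vecs]


def conv2d_valid(matrix, kernel):
    """Plain 'valid' 2-D cross-correlation, the operation used in CNN layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(matrix) - kh + 1
    out_w = len(matrix[0]) - kw + 1
    return [
        [
            sum(matrix[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu_max_pool(feature_map):
    """ReLU followed by global max pooling, giving one scalar feature."""
    return max(max(0.0, x) for row in feature_map for x in row)


# Toy vectors standing in for Bert embeddings (hypothetical values).
topic_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
text_vecs = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]]

cond = condition_matrix(topic_vecs, text_vecs)
kernel = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical fixed filter; a real CNN learns its weights
feature = relu_max_pool(conv2d_valid(cond, kernel))
```

In a full model, many learned filters would each yield one pooled feature, and a classifier over those features would predict the stance label (favor, against, or neither).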
Authors
王安君
黄凯凯
陆黎明
WANG An-Jun; HUANG Kai-Kai; LU Li-Ming (College of Information, Mechanical and Electrical Engineering, Shanghai Normal University, Shanghai 201400, China)
Source
《计算机系统应用》
2019, No. 11, pp. 45-53 (9 pages)
Computer Systems & Applications
Keywords
stance detection (立场检测)
topic phrase (主题短语)
condition matrix (关系矩阵)
text representation (句向量)