Abstract
Sarcasm is a common pragmatic phenomenon in daily communication; it enriches a speaker's viewpoint and conveys the speaker's deeper meaning indirectly. The goal of the sarcasm detection task is to identify the sarcastic tendency of a target sentence. Because sarcastic contexts and expressions are highly varied, and the meaning of sarcasm differs across users and topics, this paper proposes a contextual sarcasm detection model that fuses user embeddings and forum topic embeddings. The model uses the sequence learning ability of the ParagraphVector method to encode user comment documents and forum topic documents, yielding the user sarcasm features and topic features for the target sentence, and uses a bidirectional gated recurrent unit (Bi-GRU) neural network to obtain the sentence encoding of the target sentence. Experimental results on a standard sarcasm detection dataset show that, compared with traditional models such as Bag-of-Words and CNN, the proposed model effectively extracts the contextual information of sentences and achieves higher sarcasm detection classification accuracy.
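The architecture described in the abstract can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation. It assumes that the user and topic embeddings are fused with the Bi-GRU sentence encoding by simple concatenation and fed to a single logistic output unit; the abstract does not specify either choice. The names `GRUCell`, `bigru_encode`, and `sarcasm_probability` are hypothetical, all weights are random and untrained, bias terms are omitted, and `user_vec` and `topic_vec` are random placeholders standing in for vectors that a trained ParagraphVector model would supply.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    # Plain matrix-vector product on nested lists.
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """Single GRU cell without bias terms (hypothetical helper)."""

    def __init__(self, input_dim, hidden_dim, rng):
        self.hidden_dim = hidden_dim
        # One (W, U) pair per gate: update "z", reset "r", candidate "h".
        self.W = {g: rand_matrix(hidden_dim, input_dim, rng) for g in "zrh"}
        self.U = {g: rand_matrix(hidden_dim, hidden_dim, rng) for g in "zrh"}

    def step(self, x, h):
        z = [sigmoid(a + b) for a, b in
             zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b) for a, b in
             zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        gated_h = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a + b) for a, b in
                zip(matvec(self.W["h"], x), matvec(self.U["h"], gated_h))]
        # Convention used here: z weights the candidate state.
        return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

    def run(self, inputs):
        h = [0.0] * self.hidden_dim
        for x in inputs:
            h = self.step(x, h)
        return h


def bigru_encode(word_vectors, fwd_cell, bwd_cell):
    # Concatenate the final forward state and the final backward state.
    forward = fwd_cell.run(word_vectors)
    backward = bwd_cell.run(list(reversed(word_vectors)))
    return forward + backward


def sarcasm_probability(word_vectors, user_vec, topic_vec,
                        fwd_cell, bwd_cell, out_weights, out_bias):
    # Assumed fusion: sentence encoding + user embedding + topic embedding.
    features = bigru_encode(word_vectors, fwd_cell, bwd_cell) + user_vec + topic_vec
    score = sum(w * f for w, f in zip(out_weights, features)) + out_bias
    return sigmoid(score)


rng = random.Random(0)
word_dim, hidden_dim, user_dim, topic_dim = 4, 3, 2, 2
fwd = GRUCell(word_dim, hidden_dim, rng)
bwd = GRUCell(word_dim, hidden_dim, rng)

# Placeholder inputs: a 5-word sentence plus user and topic vectors.
sentence = [[rng.uniform(-1, 1) for _ in range(word_dim)] for _ in range(5)]
user_vec = [rng.uniform(-1, 1) for _ in range(user_dim)]
topic_vec = [rng.uniform(-1, 1) for _ in range(topic_dim)]
out_weights = [rng.uniform(-0.1, 0.1)
               for _ in range(2 * hidden_dim + user_dim + topic_dim)]

prob = sarcasm_probability(sentence, user_vec, topic_vec, fwd, bwd, out_weights, 0.0)
print(prob)
```

With untrained weights the printed probability carries no meaning; the sketch only shows how the three feature sources could be combined into one classifier input.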
Authors
韩虎
赵启涛
孙天岳
刘国利
HAN Hu; ZHAO Qitao; SUN Tianyue; LIU Guoli (School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China; Gansu Provincial Engineering Research Center for Artificial Intelligence and Graphic and Image Processing, Lanzhou 730070, China)
Source
《计算机工程》
CAS
CSCD
PKU Core Journal
2021, No. 1, pp. 66-71 (6 pages)
Computer Engineering
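Funding
National Natural Science Foundation of China (61562057)
National Social Science Foundation of China (17BXW071)
Science and Technology Program of Gansu Province (18JR3RA104).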