Abstract
[Objective] This paper proposes a Temporal Sequence Generative Adversarial Network (T-SeqGAN) that automatically generates online comment posts, aiming to counter malicious information on social networks and guide public opinion in the right direction. [Methods] We replace the generator of the Sequence Generative Adversarial Network (SeqGAN) with a Seq2Seq structure that uses a bidirectional gated recurrent unit (BiGRU) as the backbone network of its encoder and a temporal convolutional network (TCN) as the backbone network of its decoder. This change improves the similarity between generated posts and real online comment posts in word-order structure and semantic features. We also replace the SeqGAN discriminator with a model that combines a TCN with an attention layer, which improves the fluency of the generated posts. [Results] Compared with the baseline model, the online comment posts generated by T-SeqGAN achieve higher BLEU-2 (0.79935), BLEU-3 (0.60396), BLEU-4 (0.47642), and KenLM (-27.67029) scores and a lower PPL (0.75247). [Limitations] The vocabulary and language style of the generated posts are constrained by the existing real posts, so the scenarios in which the method applies are limited. [Conclusions] The online comment posts generated by the proposed model show higher word-order and grammatical correctness and greater content similarity to real posts, and can guide public opinion on social networks in the right direction.
Authors
Liu Xinran (刘欣然)
Xu Yabin (徐雅斌)
Li Jixian (李继先)
Affiliations: Beijing Key Laboratory of Network Culture and Digital Communication, Beijing University of Information Science and Technology, Beijing 100101, China; School of Computer Science, Beijing University of Information Science and Technology, Beijing 100101, China; School of Humanities and Education, Beijing Open University, Beijing 100081, China
Source
Data Analysis and Knowledge Discovery (《数据分析与知识发现》)
Indexed in: CSCD; Peking University Core Journals (北大核心)
2023, No. 4, pp. 101-113 (13 pages)
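Funding
This work is one of the research outcomes of the following projects:
National Natural Science Foundation of China (Grant No. 61672101)
Open Project of the Beijing Key Laboratory of Network Culture and Digital Communication (Grant No. ICCD XN004)
Open Project of the Key Laboratory of Information Network Security, Ministry of Public Security (Grant No. C18601)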