Abstract
Fake reviews are rampant on the Internet, yet no fully open Chinese dataset is available for research on Chinese fake review detection. To address this problem, a Chinese fake review data generation model based on a generative adversarial network is proposed. First, Monte Carlo search is applied to the text sequences produced by the generator to obtain a batch of samples. Then, reinforcement learning is used to convert the feedback from the discriminator, classifier, and reconstructor into reward scores. Finally, the reward scores are passed back to the generator, whose parameters are optimized so that it generates fake review data that is close to real-world reviews and carries the attributes and features of the corresponding class labels. With BLEU as the evaluation metric, experimental results show that the proposed model achieves better BLEU values on the dataset used in this paper and generates text of good quality.
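The reward step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the toy vocabulary, the uniform rollout policy, the three scoring functions, and the equal weighting are all assumptions made here so that the example runs on its own. It only shows the general pattern of completing a partial sequence by Monte Carlo rollouts and averaging the combined feedback into one reward.

```python
import random

# Toy vocabulary and sequence length (assumptions for illustration only).
VOCAB = ["好", "差", "服务", "味道", "很", "不"]
MAX_LEN = 6


def rollout(prefix, rng):
    """Complete a partial sequence to MAX_LEN tokens.

    Tokens are sampled uniformly here; in a real model the generator's
    own policy would do the sampling.
    """
    seq = list(prefix)
    while len(seq) < MAX_LEN:
        seq.append(rng.choice(VOCAB))
    return seq


def discriminator_score(seq):
    # Hypothetical stand-in for "looks real": share of distinct tokens.
    return len(set(seq)) / len(seq)


def classifier_score(seq, label):
    # Hypothetical stand-in for "matches the class label".
    target = "好" if label == "pos" else "差"
    return seq.count(target) / len(seq)


def reconstructor_score(seq, label):
    # Hypothetical stand-in for "label can be recovered from the text".
    target = "好" if label == "pos" else "差"
    return 1.0 if target in seq else 0.0


def mc_reward(prefix, label, n_rollouts=16, weights=(1 / 3, 1 / 3, 1 / 3), seed=0):
    """Average the weighted feedback over several Monte Carlo rollouts."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_rollouts):
        seq = rollout(prefix, rng)
        scores = (
            discriminator_score(seq),
            classifier_score(seq, label),
            reconstructor_score(seq, label),
        )
        total += sum(w * s for w, s in zip(weights, scores))
    return total / n_rollouts


if __name__ == "__main__":
    # Reward estimate for the partial sequence ["很"] under the "pos" label.
    print(mc_reward(["很"], "pos"))
```

In a policy-gradient setup, a value like this would scale the log-probability of the token chosen at each step when the generator is updated. How the paper actually weights the three feedback signals is not stated in the abstract.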
Authors
WU Zheng-qing (吴正清)
CAO Hui (曹晖)
Key Laboratory of China's Ethnic Languages and Information Technology of Ministry of Education, Northwest Minzu University, Lanzhou 730030, Gansu, China
Source
Journal of Yunnan University (Natural Sciences Edition) (《云南大学学报(自然科学版)》)
Indexed in: CAS, CSCD, PKU Core (北大核心)
2023, No. 5, pp. 1033-1042 (10 pages)
Funding
National Natural Science Foundation of China (61633013)
Fundamental Research Funds for the Central Universities (31920230054)
Keywords
fake reviews
generative adversarial network
text generation
reinforcement learning