Abstract
Text produced by artificial intelligence generated content (AIGC) technology carries ethical and legal compliance risks, so the circulation of generated text needs to be regulated and supervised. This creates an urgent need for copyright protection of AIGC-generated text. Watermarking is currently the most widely used method of digital copyright protection. A watermarking technique is proposed for text generated by generative causal language models. It adopts in-process embedding: the watermark feature code is embedded implicitly while the text is being generated. Compared with traditional post-process watermarking, it has less impact on the quality of the generated text and offers low perceptibility, transparency, and robustness. Experimental results show that the proposed embedding strategy is robust, and the embedded watermark can still be detected effectively after a certain degree of editing by users. Compared with the original generation strategy, the proposed method is loosely coupled with existing models: it requires no changes to the original model structure, training strategy, or deployment method, and it does not increase the computational cost of the original generation process.
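As a rough illustration of what embedding a watermark during generation can look like, the sketch below biases a toy sampler toward a keyed, pseudo-random "green" subset of the vocabulary and later detects the watermark by counting green tokens. This green-list scheme is an assumption chosen for illustration and is not taken from the paper; the toy vocabulary, `toy_logits`, the key string, and the `delta` value are all hypothetical.

```python
import hashlib
import math
import random

VOCAB = [f"w{i}" for i in range(50)]  # toy vocabulary (hypothetical)


def green_set(prev_token, key, fraction=0.5):
    """Keyed pseudo-random subset of VOCAB, derived from the previous token."""
    digest = hashlib.sha256(f"{key}|{prev_token}".encode()).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return set(rng.sample(VOCAB, int(len(VOCAB) * fraction)))


def toy_logits(rng):
    """Stand-in for a causal language model: random logits over VOCAB."""
    return [rng.gauss(0.0, 1.0) for _ in VOCAB]


def generate(n_tokens, key, delta, seed=0):
    """Sample tokens, adding `delta` to the logits of green tokens at each step."""
    rng = random.Random(seed)
    tokens = ["<s>"]
    for _ in range(n_tokens):
        green = green_set(tokens[-1], key)
        logits = toy_logits(rng)
        biased = [l + delta if w in green else l for l, w in zip(logits, VOCAB)]
        weights = [math.exp(b) for b in biased]  # unnormalised softmax
        tokens.append(rng.choices(VOCAB, weights=weights, k=1)[0])
    return tokens[1:]


def detect_z(tokens, key, fraction=0.5):
    """z-score of the green-token count against the no-watermark expectation."""
    prev, hits = "<s>", 0
    for tok in tokens:
        if tok in green_set(prev, key, fraction):
            hits += 1
        prev = tok
    n = len(tokens)
    return (hits - fraction * n) / math.sqrt(n * fraction * (1 - fraction))


watermarked = generate(200, key="secret", delta=4.0)
print(round(detect_z(watermarked, key="secret"), 2))
```

In this toy setup the bias is applied only at sampling time, so the stand-in model itself is untouched, and detection needs only the key and the text. A large positive z-score suggests a watermark, while unwatermarked text should score near zero.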
Authors
LIU Minglu (刘明录)
ZHENG Yan (郑彦)
HAN Xue (韩雪)
YUAN Xiangyang (袁向阳)
DENG Chao (邓超)
China Mobile Research Institute, Beijing 100053, China
Source
Telecommunications Science (《电信科学》)
2023, No. 9, pp. 32-42 (11 pages)