Abstract
To address the limited number of substitutable words and the low watermark extraction efficiency of existing natural language digital watermarking methods, a digital watermarking method based on context word prediction and window compression coding is proposed. First, a neural network language model automatically learns the contextual semantic features of each word in the original text and predicts a candidate word list for each word, which expands the number of substitutable words available for carrying watermark information. Second, because substitutions of candidate words at different positions affect sentence semantics to different degrees, the watermark information is embedded in units of windows, each consisting of several words, and the choice of candidate words during embedding is optimized using the similarity between the sentences before and after word substitution. On this basis, a semantic-independent window compression coding method is proposed, which encodes each window into watermark information according to the character information of the words in the window, so that watermark extraction no longer depends on the original context at the substitution positions. Experimental results show that the proposed method greatly improves watermark extraction efficiency while maintaining high embedding capacity and text quality.
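The window-level idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual coding scheme, so everything concrete here is an assumption: `window_code` uses a SHA-256 digest of the window's characters as a stand-in for the paper's compression coding, the candidate lists are supplied by hand rather than predicted by a language model, and the default score (number of unchanged words) is a crude placeholder for sentence similarity.

```python
import hashlib
from itertools import product


def window_code(words, bits=2):
    """Map a window of words to `bits` watermark bits using only its characters.

    Assumption: a hash of the characters stands in for the paper's coding.
    """
    digest = hashlib.sha256(" ".join(words).encode("utf-8")).digest()
    return digest[0] % (1 << bits)


def embed(window, candidates, target, bits=2, similarity=None):
    """Choose one candidate per position so that the window encodes `target`.

    candidates[i] lists the allowed words for position i, including the
    original word. `similarity(original, substituted)` is a hypothetical
    scoring callable; by default the number of unchanged words is used.
    Returns the best-scoring window, or None if no combination fits.
    """
    best = None
    for combo in product(*candidates):
        if window_code(combo, bits) != target:
            continue
        if similarity is not None:
            score = similarity(window, combo)
        else:
            score = sum(a == b for a, b in zip(window, combo))
        if best is None or score > best[0]:
            best = (score, list(combo))
    return None if best is None else best[1]


def extract(words, window_size, bits=2):
    """Recover the watermark from the marked text alone, window by window."""
    last_start = len(words) - window_size
    return [
        window_code(words[i:i + window_size], bits)
        for i in range(0, last_start + 1, window_size)
    ]
```

Because `extract` looks only at the characters of the marked text, it needs neither the original text nor the language model, which is the property the abstract attributes to semantic-independent coding. A real system would also have to handle windows for which no candidate combination yields the target code.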
Authors
XIANG Lingyun (向凌云)
HUANG Minghao (黄明豪)
ZHANG Chenling (张晨凌)
YANG Chunfang (杨春芳)
Affiliations: School of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha 410114, China; Hunan Provincial Key Laboratory of Intelligent Processing of Big Data on Transportation, Changsha University of Science and Technology, Changsha 410114, China; Henan Key Laboratory of Cyberspace Situation Awareness, Information Engineering University, Zhengzhou 450001, China
Source
Journal on Communications (《通信学报》)
EI
CSCD
PKU Core (北大核心)
2024, No. 2, pp. 213-224 (12 pages)
Funding
National Natural Science Foundation of China (No. 61972057, No. 61872448)
Natural Science Foundation of Hunan Province (No. 2022JJ30623)
Keywords
digital watermarking
word substitution
word prediction
watermarking coding