Natural Language Steganalysis Method Based on Word2vec (cited by: 6)
Abstract: To represent the semantic information of text content numerically and to improve the accuracy of detecting stego texts produced by synonym substitution, a novel natural language steganalysis method is proposed. Word2vec is trained on a large-scale corpus to obtain multi-dimensional word vectors that carry rich semantic information. The cosine distance between the vector of a synonym and the vectors of its context words is used to measure the relatedness of two words, and from it the fitness of a synonym in a specific context is calculated. Detection features are extracted according to how the synonym substitutions performed during information embedding affect the context fitness of the synonyms in a text, and these features form a feature vector. A Bayesian classification model is trained on the feature vectors to identify stego texts. Experimental results show that, across different embedding rates, the method achieves an average detection precision of 97.71% and an average recall of 92.64%, indicating good detection performance.
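The context-fitness idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact fitness formula, so the definition below (mean cosine similarity between a candidate synonym and its context words) is an assumption, as are the function names and the toy 3-dimensional vectors, which stand in for vectors that would come from a trained Word2vec model. The classifier step is not shown.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_u * norm_v)


def context_fitness(word, context_words, vectors):
    """Assumed fitness: mean cosine similarity between `word` and the
    context words that have a vector. Not the paper's exact formula."""
    if word not in vectors:
        return 0.0
    sims = [
        cosine_similarity(vectors[word], vectors[c])
        for c in context_words
        if c in vectors
    ]
    return sum(sims) / len(sims) if sims else 0.0


def rank_synonyms(synonyms, context_words, vectors):
    """Order a synonym set from best to worst fit for the given context."""
    return sorted(
        synonyms,
        key=lambda w: context_fitness(w, context_words, vectors),
        reverse=True,
    )


# Hypothetical toy vectors, invented for illustration only.
TOY_VECTORS = {
    "big": [0.9, 0.1, 0.0],
    "large": [0.8, 0.2, 0.1],
    "grand": [0.1, 0.9, 0.2],
    "house": [0.7, 0.3, 0.1],
    "garden": [0.6, 0.2, 0.3],
}

context = ["house", "garden"]
ranking = rank_synonyms(["big", "large", "grand"], context, TOY_VECTORS)
```

Under this assumed definition, a synonym substituted during embedding would tend to rank lower in its synonym set than the word a writer would naturally choose, and statistics of such ranks or fitness values over a whole text could then serve as the features passed to a classifier.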
Authors: YU Jingmin; XIANG Lingyun; ZENG Daojian (Hunan Provincial Key Laboratory of Intelligent Processing of Big Data on Transportation, Changsha University of Science and Technology, Changsha 410114, China; School of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha 410114, China; Hunan Provincial Key Laboratory of Smart Roadway and Cooperative Vehicle-Infrastructure Systems, Changsha University of Science and Technology, Changsha 410114, China)
Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD and the PKU Core list; 2019, No. 3, pp. 309-314 (6 pages)
Funding: National Natural Science Foundation of China (61202439, 61602059); Key Scientific Research Project of the Education Department of Hunan Province (16A008)
Keywords: natural language; word vector; synonym substitution; steganalysis; context fitness
