A review on speaker audio attack and defense technologies
Abstract: This study reviews recent advances in speaker audio attack and defense technologies. Speaker audio attacks have become a serious threat to the security of voice applications. Taking the adoption of three models (WaveNet, Transformer, and GAN) in audio attack techniques as milestones, we introduce the attack technologies built on each of them. We divide audio defense technologies into three categories according to the attacks they cover: basic audio attacks, replay attacks, and deepfake (deep forgery) attacks. We systematically present the latest research on speaker audio attack and defense technologies, analyze and compare the strengths and weaknesses of each algorithm under different conditions, and introduce the datasets commonly used in audio research. Finally, based on the current state of the field, we identify issues in speaker audio attack and defense research that need urgent attention.
Authors: SUN Zhixin; ZHAO Jie; WANG Enliang; LIU Chenlei; FAN Liancheng; LIU Chang (Post Big Data Technology and Application Engineering Research Center of Jiangsu Province, Nanjing University of Posts and Telecommunications, Nanjing 210003, China; Post Industry Technology Research and Development Center of the State Posts Bureau (Internet of Things Technology), Nanjing University of Posts and Telecommunications, Nanjing 210003, China; Key Lab of Broadband Wireless Communication and Sensor Network Technology, Ministry of Education, Nanjing University of Posts and Telecommunications, Nanjing 210003, China; Anhui Nanling County Post Development Center, Wuhu 241399, China)
Source: Journal of Nanjing University of Posts and Telecommunications: Natural Science Edition (Peking University Core Journal), 2024, No. 4, pp. 17-29 (13 pages)
Funding: Supported by the National Natural Science Foundation of China (61972208, 62272239).
Keywords: speaker audio; audio forgery; audio forensics; audio datasets; deep learning