
Identification and restoration of transformed voice by fusing CNN and Transformer encoder (cited by: 1)
Abstract: Voice transformation disguises speech so that both human listeners and speaker recognition systems misidentify it, which conceals the speaker's real identity. To reduce the impact of transformed voice, this paper proposes a model that fuses a Convolutional Neural Network (CNN) with a Transformer encoder. The model extracts local and global features of the transformed voice to predict the disguise factor, and the original voice is then restored according to the predicted value of that factor. The method was validated on Chinese and English datasets recorded in real-world scenes, where the accuracy of disguise-factor prediction exceeded 95%. Under black-box conditions, the method also performed well when identifying and restoring the speech output by a commercial hardware voice changer.
Authors: Wei Chunyu (魏春雨); Sun Meng (孙蒙); Liu Wei (刘伟); Zhang Xingyu (张星昱) (College of Command and Control Engineering, Army Engineering University of PLA, Nanjing 210007, China)
Source: Information Technology and Network Security (《信息技术与网络安全》), 2022, No. 1, pp. 47-54 (8 pages)
Funding: Jiangsu Province Excellent Young Scholars Fund (BK20180080).
Keywords: pitch scaling; voice anti-disguise; voice restoration; time-frequency features
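The restoration step described in the abstract (undoing the transformation once the disguise factor is known) can be illustrated with a small arithmetic sketch. This is not the authors' implementation: it assumes the disguise factor is a pitch shift measured in semitones on the equal-tempered scale, which the abstract does not state, and all function names are hypothetical. It only shows that applying the inverse factor recovers the original fundamental frequency.

```python
import math

def semitones_to_ratio(semitones: float) -> float:
    """Frequency scaling ratio for a pitch shift of `semitones`.

    Assumption: 12 semitones per octave (equal temperament).
    """
    return 2.0 ** (semitones / 12.0)

def restoration_factor(predicted_factor: float) -> float:
    """Shift (in semitones) that undoes a disguise of `predicted_factor`."""
    return -predicted_factor

def restore_f0(disguised_f0_hz: float, predicted_factor: float) -> float:
    """Map a disguised fundamental frequency back to its estimated original."""
    return disguised_f0_hz * semitones_to_ratio(restoration_factor(predicted_factor))

# Hypothetical example: a 120 Hz voice raised by 4 semitones, then restored.
original_f0 = 120.0
disguise = 4.0
disguised_f0 = original_f0 * semitones_to_ratio(disguise)
restored_f0 = restore_f0(disguised_f0, disguise)
```

In this sketch the quality of restoration depends entirely on how accurately the disguise factor is predicted, which is the quantity the proposed CNN and Transformer encoder model is reported to estimate.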
