GAN-Based Domain Adaptation Algorithm for Speaker Verification
Abstract A key problem in speaker verification is the mismatch between the training data and the speech encountered in real applications, which can significantly lower verification accuracy. In most application scenarios it is not possible to collect enough samples to retrain the speaker verification model, while the samples used to train the original model often differ from real-world speech because of intrinsic factors (e.g., emotion, language, vocal effort, speaking style, and aging) or extrinsic factors (e.g., background noise, transmission channel, microphone, room acoustics, and distance from the microphone). To address this problem, an adversarial-network-based domain adaptation algorithm is proposed and applied to an X-Vector-based speaker verification model. First, the X-Vector model is trained on source-domain speech (AISHELL1). Then, the domain adaptation method is used to adapt the source-trained X-Vector model to the target-domain training data (VoxCeleb1 or CN-Celeb). Finally, the adapted model is evaluated on the target-domain test data and compared with the model before adaptation. The experimental results show that, after adaptation, the Equal Error Rate (EER) on the VoxCeleb1 and CN-Celeb target-domain test sets is reduced by 21.46% and 19.24%, respectively.
Authors JI Minfei (季敏飞); CHEN Ning (陈宁) (School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China)
Source Journal of East China University of Science and Technology (Natural Science Edition) (《华东理工大学学报(自然科学版)》), 2022, No. 2, pp. 231-236 (6 pages). Indexed in: CAS; CSCD; Peking University Core Journals.
Funding General Program of the National Natural Science Foundation of China (61771196).
Keywords speaker verification; domain adaptation; adversarial network
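The abstract reports its results as Equal Error Rate (EER), the operating point at which the false acceptance rate and the false rejection rate are equal. The following is a minimal sketch of how EER can be estimated from verification scores by sweeping a decision threshold. It is not the authors' evaluation code: the function name `equal_error_rate`, the "accept if score >= threshold" convention, and the toy scores are assumptions made for illustration only.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Estimate the EER by sweeping every observed score as a threshold.

    target_scores:    scores of same-speaker trials (should be accepted).
    nontarget_scores: scores of different-speaker trials (should be rejected).
    A trial is accepted when its score is >= the threshold (an assumed
    convention). Returns (eer, threshold) at the point where the false
    acceptance and false rejection rates are closest.
    """
    if not target_scores or not nontarget_scores:
        raise ValueError("both score lists must be non-empty")

    best = None  # (gap, eer_estimate, threshold)
    for threshold in sorted(set(target_scores) | set(nontarget_scores)):
        # False rejection: a same-speaker trial scored below the threshold.
        frr = sum(s < threshold for s in target_scores) / len(target_scores)
        # False acceptance: a different-speaker trial scored at or above it.
        far = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # With finitely many trials the two rates rarely match exactly,
            # so their average at the closest point serves as the estimate.
            best = (gap, (far + frr) / 2, threshold)
    return best[1], best[2]


# Toy scores, invented purely for illustration (not data from the paper).
target = [0.9, 0.8, 0.7, 0.4]
nontarget = [0.6, 0.3, 0.2, 0.1]
eer, threshold = equal_error_rate(target, nontarget)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

Comparing a model before and after adaptation would amount to calling this function once on each model's scores for the same target-domain trial list. The abstract does not state whether the reported 21.46% and 19.24% reductions are relative or absolute, so the sketch does not assume either.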