Synthetic Speech Detection Based on Improved Time Delay Neural Network
Abstract: Building on a time delay neural network (TDNN) with a variable-kernel mechanism, this paper proposes a network architecture with a global multi-scale attention mechanism, together with a fusion feature that combines Fbank and Inverted Mel-Frequency Cepstral Coefficients (IMFCC). Experiments on the ASVspoof 2019 LA dataset use equal error rate (EER) and test-set accuracy as evaluation metrics. With the same acoustic features, the proposed architecture improves recognition accuracy by 5.1% and 4.3% over ECAPA-TDNN and SKA-TDNN, respectively.
Authors: WANG Zhiyi (王志翼); ZHANG Hongbing (张红兵) (School of Public Security Information Technology, Criminal Investigation Police University of China, Shenyang 110035, China)
Source: Audio Engineering (《电声技术》), 2023, No. 9, pp. 118-120 (3 pages)
Funding: 2023 Fundamental Research Funds for the Central Universities, Major Project Cultivation Program (JYTZD2023150).
Keywords: time-delay neural network; synthetic speech; feature fusion
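
Note on the evaluation metric: the abstract reports equal error rate (EER) without defining it. EER is the operating point at which the false acceptance rate (spoofed speech accepted as bona fide) equals the false rejection rate (bona fide speech rejected). The following is a minimal sketch of one common way to estimate it from detector scores, not the authors' evaluation code. The function name `equal_error_rate`, the label convention (1 = bona fide, 0 = spoofed), and the assumption that a higher score means "more likely bona fide" are all illustrative choices that do not come from the paper.

```python
def equal_error_rate(scores, labels):
    """Estimate the EER by sweeping a threshold over the observed scores.

    Assumptions (illustrative, not from the paper):
      - labels: 1 = bona fide, 0 = spoofed
      - a higher score means the detector believes the sample is bona fide
    Returns (eer, threshold) at the point where FAR and FRR are closest.
    """
    bona = [s for s, y in zip(scores, labels) if y == 1]
    spoof = [s for s, y in zip(scores, labels) if y == 0]
    if not bona or not spoof:
        raise ValueError("need at least one bona fide and one spoofed score")

    best = None  # (gap between FAR and FRR, EER estimate, threshold)
    for t in sorted(set(scores)):
        # False rejection rate: bona fide samples scoring below the threshold.
        frr = sum(s < t for s in bona) / len(bona)
        # False acceptance rate: spoofed samples scoring at or above it.
        far = sum(s >= t for s in spoof) / len(spoof)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Toy scores: one spoofed sample (0.6) outscores one bona fide sample (0.4).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 1, 0, 0, 0]
eer, threshold = equal_error_rate(toy_scores, toy_labels)
print(f"EER = {eer:.2f} at threshold {threshold}")
```

The threshold sweep here is quadratic in the number of scores, which is fine for illustration. For a full evaluation set, a sorted cumulative-count formulation would be the more practical choice.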
