Abstract
To address the over-fitting problem of LCNN-based speaker recognition systems under audio replay attacks, a neural network based on AOF-LCNN is proposed. A new DNN classifier is designed as the back-end classification network and cascaded after the LCNN to form a new end-to-end architecture. Because the MFM (Max-Feature-Map) structure in the LCNN may be a cause of the over-fitting, LeakyReLU is used as the activation function in the DNN back end to offset the over-fitting effect of MFM. Results on the ASVspoof 2017 dataset show that the proposed method achieves equal error rates (EER) of 3.59% on the Dev set and 13.79% on the Eval set, which are 2.12% and 3.51% lower, respectively, than those of the LCNN system. The proposed method mitigates the over-fitting problem to some extent, improves the robustness of the system, and lowers its EER, thereby improving recognition performance.
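The abstract names three building blocks without defining them: the MFM activation used inside the LCNN, the LeakyReLU activation used in the DNN back end, and the EER metric. The following is a minimal sketch of their standard definitions in plain Python, not the authors' implementation. The function names, the slope of 0.01, and the toy scores are assumptions chosen for illustration; the abstract does not state the layer sizes or the slope actually used.

```python
def mfm(channels):
    """Max-Feature-Map: split the channels into two halves and keep
    the element-wise maximum, halving the number of channels."""
    if len(channels) % 2 != 0:
        raise ValueError("MFM needs an even number of channels")
    half = len(channels) // 2
    return [max(a, b) for a, b in zip(channels[:half], channels[half:])]


def leaky_relu(values, negative_slope=0.01):
    """LeakyReLU: pass positive values through, scale negative ones.
    The slope 0.01 is an assumed default, not taken from the paper."""
    return [x if x > 0 else negative_slope * x for x in values]


def equal_error_rate(genuine_scores, spoof_scores):
    """Approximate the EER by sweeping every observed score as a threshold
    and returning the point where false acceptance and false rejection
    rates are closest (higher score = more likely genuine)."""
    best_gap, eer = float("inf"), None
    for threshold in sorted(set(genuine_scores) | set(spoof_scores)):
        # False rejection: genuine trials scored below the threshold.
        frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
        # False acceptance: spoof trials scored at or above the threshold.
        far = sum(s >= threshold for s in spoof_scores) / len(spoof_scores)
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer


if __name__ == "__main__":
    hidden = mfm([1.0, -2.0, 0.5, 3.0])   # LCNN-style activation
    print(leaky_relu(hidden))             # back-end-style activation
    print(equal_error_rate([0.9, 0.8, 0.4], [0.1, 0.3, 0.5]))
```

MFM discards the smaller value of each channel pair outright, while LeakyReLU keeps a small scaled response for negative inputs; this difference in behaviour is what the abstract relies on when it pairs the two.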
Authors
LI Bo (李波); CAI Xiaodong (蔡晓东); HOU Zhenzhen (侯珍珍); CHEN Si (陈思)
School of Information and Communication, Guilin University of Electronic Technology, Guilin 541004, China
Source
Journal of Guilin University of Electronic Technology (《桂林电子科技大学学报》), 2020, No. 1, pp. 13-17 (5 pages)
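Funding
Xinjiang Key Research and Development Program (2018B03022-1, 2018B03022-2)
Innovation Project of Guilin University of Electronic Technology Graduate Education (2017YJCX29).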