Abstract
Text-independent speaker verification models achieve strong performance through complex network structures and varied feature extraction methods, but at the cost of heavy memory consumption and growing computational cost, which makes them difficult to deploy on resource-limited hardware. To address this problem, this work exploits two advantages of the virtual teacher in teacher-free knowledge distillation (Tf-KD), namely 100% classification accuracy and a smooth output probability distribution, and builds a teacher-free speaker verification (Tf-SV) model on top of a lightweight residual network. A spatially shared, channel-wise dynamic rectified linear unit activation function and the additive angular margin loss (AAM-Softmax) are also introduced. Together they substantially improve the model's feature representation, training efficiency, and post-compression performance, so that the text-independent speaker verification model can be deployed on devices with limited storage or computing resources. Experiments on the VoxCeleb1 dataset show that the Tf-SV model reduces the equal error rate (EER) to 3.4%. This is a clear improvement over previously published results and demonstrates the effectiveness of the proposed compressed model on the speaker verification task.
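The two loss ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names and every hyperparameter value (`correct_prob`, `margin`, `scale`, `alpha`, `temperature`) are assumptions chosen for illustration, the way the two terms are combined is a simplification, and the dynamic activation function is left out.

```python
import math

def virtual_teacher(num_classes, target, correct_prob=0.9):
    """Hand-designed teacher: `correct_prob` on the true class (so it is always
    'correct'), with the remainder spread uniformly over the other classes."""
    rest = (1.0 - correct_prob) / (num_classes - 1)
    return [correct_prob if k == target else rest for k in range(num_classes)]

def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def aam_logits(cosines, target, margin=0.2, scale=30.0):
    """Additive angular margin: cos(theta_y) becomes cos(theta_y + margin)
    for the target class only, then all cosines are scaled."""
    logits = []
    for k, c in enumerate(cosines):
        c = max(-1.0, min(1.0, c))  # keep acos in its domain
        if k == target:
            c = math.cos(math.acos(c) + margin)
        logits.append(scale * c)
    return logits

def tfkd_loss(cosines, target, alpha=0.5, temperature=2.0):
    """(1 - alpha) * cross-entropy + alpha * KL(virtual teacher || student).
    Assumed, simplified combination; values are illustrative only."""
    logits = aam_logits(cosines, target)
    cross_entropy = -math.log(softmax(logits)[target])
    teacher = virtual_teacher(len(cosines), target)
    student = softmax(logits, temperature)
    kl = sum(t * math.log(t / s) for t, s in zip(teacher, student))
    return (1.0 - alpha) * cross_entropy + alpha * kl
```

Here `cosines` stands for the cosine similarities between a speaker embedding and each class weight vector. A call such as `tfkd_loss([0.1, 0.8, -0.2, 0.3], target=1)` returns a single non-negative training loss for one utterance.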
Authors
肖金壮
李瑞鹏
纪盟盟
XIAO Jinzhuang; LI Ruipeng; JI Mengmeng (College of Electronic Information Engineering, Hebei University, Baoding, Hebei 071000, China)
Source
《计算机工程与应用》
CSCD
Peking University Core Journals (北大核心)
2022, Issue 8, pp. 198-203 (6 pages)
Computer Engineering and Applications
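Funding
Natural Science Foundation of Hebei Province, General Program (H2016201201)
Key Project of Science and Technology Research of Higher Education Institutions of Hebei Province (ZD2016149)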
Keywords
teacher-free knowledge distillation
dynamic rectified linear unit function
additive angular margin loss function
model compression
speaker verification