Abstract
In neural-network-based speaker feature extraction models, the choice of pooling method affects how well voiceprint features are aggregated. Compared with traditional pooling methods, some pooling methods that incorporate attention mechanisms show stronger feature aggregation capability. Building on this observation, a step-by-step pooling method for voiceprint feature aggregation is proposed and evaluated on publicly available datasets. The results show that the proposed method effectively improves the aggregation of voiceprint features and raises the accuracy of voiceprint recognition.
Authors
冯坤
和椿皓
FENG Kun; HE Chunhao (Baoding Drainage Service Center, Baoding 071051, China; College of Electronic Information Engineering, Hebei University, Baoding 071000, China)
Source
《电声技术》
2024, No. 3, pp. 21-23, 72 (4 pages in total)
Audio Engineering
Keywords
voiceprint recognition
neural networks
pooling
deep learning