Abstract
Speaker-aware training based on the identity vector (i-vector) extracts i-vectors from MFCC input features, but the relatively poor robustness of MFCCs degrades the recognition performance of this training method. To address this, an improved i-vector based speaker-aware training method is proposed. A low-dimensional feature extraction method based on SVD is designed, and the features it extracts replace MFCCs so that i-vectors with better representational ability can be extracted. Experimental results show that, on the Czech Vystadial_cz corpus, recognition performance improves by 1.62% and 1.52% compared with the DNN-HMM speech recognition system and the original i-vector based speaker-aware training method, respectively; on the WSJ corpus, the improvements are 3.9% and 1.48%, respectively.
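As a rough illustration of how an SVD can yield a low-dimensional feature, the sketch below factors a weight matrix into two low-rank matrices and uses the narrow intermediate projection as the feature. This is an assumption about the general technique, not the authors' implementation: the function name `svd_bottleneck`, the matrix sizes, and the rank `k` are all hypothetical, and the abstract does not state which matrix is decomposed.

```python
import numpy as np

def svd_bottleneck(W, k):
    """Factor W (out_dim x in_dim) into A (out_dim x k) and B (k x in_dim).

    Hypothetical helper: keeps the k largest singular values so that
    A @ B is the best rank-k approximation of W.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :k] * s[:k]   # scale each kept left singular vector by its singular value
    B = Vt[:k, :]          # projection from in_dim down to k dimensions
    return A, B

rng = np.random.default_rng(0)
W = rng.standard_normal((1024, 440))   # hypothetical layer weights
A, B = svd_bottleneck(W, k=40)

x = rng.standard_normal(440)           # hypothetical input frame
low_dim = B @ x                        # 40-dimensional feature taken at the narrow layer

# Keeping every singular value reproduces W, which checks the factorization.
A_full, B_full = svd_bottleneck(W, k=440)
```

In a setup like this, `low_dim` would stand in for the MFCC vector as the input to i-vector extraction; how the paper actually trains and places the low-rank layer is not specified in the abstract.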
Authors
梁玉龙
屈丹
邱泽宇
LIANG Yulong; QU Dan; QIU Zeyu (School of Information and Systems Engineering, PLA Information Engineering University, Zhengzhou 450002, China)
Source
《计算机工程》
CAS
CSCD
Peking University Core Journals (北大核心)
2018, Issue 5, pp. 262-267 (6 pages)
Computer Engineering
Funding
National Natural Science Foundation of China (61673395, 61403415)
Natural Science Foundation of Henan Province (162300410331)
Keywords
speaker-aware training
i-vector
Deep Neural Network (DNN)
Singular Value Matrix Decomposition (SVMD)
bottleneck feature