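Traditional methods for mixed-language speech emotion recognition do not adequately account for differences between languages, which leads to low classification accuracy. To address this problem, a method combining an autoencoder with a long short-term memory (LSTM) model is proposed. Three kinds of features (MFCC, mel spectrogram frequency, and chroma) are extracted to form a 180-dimensional feature vector. The autoencoder then maps this vector to a higher-dimensional, deeper 500-dimensional feature, which is modeled by the LSTM to improve the accuracy of speech emotion classification. Classification experiments on the German EMO-DB and Chinese CASIA speech corpora show that the deep features extracted by the autoencoder are better suited to mixed-language speech emotion classification. Compared with traditional classification methods, classification with autoencoder + LSTM improves the best recognition result by 7.5%.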
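The feature pipeline in this abstract (180-dimensional input, 500-dimensional autoencoder code) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the 40 + 128 + 12 split of the 180 dimensions, the time-averaging step, the tanh activation, the single hidden layer, and the random stand-in data are all assumptions, since the abstract gives only the totals. The LSTM stage is left out.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed split of the 180 dimensions (the abstract states only the total):
# 40 MFCCs + 128 mel-spectrogram bands + 12 chroma bins.
N_MFCC, N_MEL, N_CHROMA = 40, 128, 12
INPUT_DIM = N_MFCC + N_MEL + N_CHROMA  # 180
CODE_DIM = 500                         # overcomplete code size from the abstract


def utterance_vector(mfcc, mel, chroma):
    """Average each (n_features, n_frames) matrix over time and concatenate."""
    return np.concatenate([mfcc.mean(axis=1), mel.mean(axis=1), chroma.mean(axis=1)])


class Autoencoder:
    """Single-hidden-layer autoencoder, 180 -> 500 -> 180 (hypothetical layout)."""

    def __init__(self, in_dim, code_dim):
        self.W1 = rng.normal(0.0, 0.05, (in_dim, code_dim))
        self.b1 = np.zeros(code_dim)
        self.W2 = rng.normal(0.0, 0.05, (code_dim, in_dim))
        self.b2 = np.zeros(in_dim)

    def encode(self, x):
        return np.tanh(x @ self.W1 + self.b1)

    def train_step(self, x, lr=0.05):
        """One gradient-descent step on the mean squared reconstruction error."""
        h = self.encode(x)
        err = (h @ self.W2 + self.b2) - x
        loss = float(np.mean(err ** 2))
        g_out = 2.0 * err / err.size            # d loss / d reconstruction
        g_h = (g_out @ self.W2.T) * (1.0 - h ** 2)  # back through tanh
        self.W2 -= lr * (h.T @ g_out)
        self.b2 -= lr * g_out.sum(axis=0)
        self.W1 -= lr * (x.T @ g_h)
        self.b1 -= lr * g_h.sum(axis=0)
        return loss


# Random matrices stand in for real per-frame features of 8 utterances.
N_FRAMES = 50
batch = np.stack([
    utterance_vector(
        rng.normal(size=(N_MFCC, N_FRAMES)),
        rng.normal(size=(N_MEL, N_FRAMES)),
        rng.normal(size=(N_CHROMA, N_FRAMES)),
    )
    for _ in range(8)
])

model = Autoencoder(INPUT_DIM, CODE_DIM)
losses = [model.train_step(batch) for _ in range(20)]
codes = model.encode(batch)  # 500-dimensional features, one row per utterance
print(batch.shape, codes.shape)
```

In the described system, the 500-dimensional codes would then be passed to an LSTM classifier; how the sequence dimension is formed for the LSTM is not specified in the abstract.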
This paper presents a new HMM/MLP hybrid network for speech recognition. By taking advantage of the discriminative training of the MLP, the unreasonable model-correctness assumption of maximum-likelihood (ML) training in the basic HMM can be overcome, and the model's discriminative ability and recognition performance can be improved. Experimental results demonstrate that the discriminative ability and recognition performance of the HMM/MLP hybrid are clearly better than those of a standard HMM.
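The abstract does not describe the internals of the proposed network. As general background only, one widely used way to couple an MLP with an HMM is to divide the MLP's state posteriors by the state priors, giving scaled likelihoods that replace the HMM emission probabilities during Viterbi decoding. The sketch below shows that classic scheme, which may differ from the paper's design; the function names and the toy two-state numbers are hypothetical.

```python
import numpy as np

EPS = 1e-10  # avoids log(0) for impossible transitions or states


def scaled_log_likelihoods(posteriors, priors):
    """log P(q | x_t) - log P(q): emission log-likelihoods up to a constant."""
    return np.log(posteriors + EPS) - np.log(priors + EPS)


def viterbi(log_emis, log_trans, log_init):
    """Most likely state sequence for (T, S) emission scores."""
    n_frames, n_states = log_emis.shape
    delta = log_init + log_emis[0]
    back = np.zeros((n_frames, n_states), dtype=int)
    for t in range(1, n_frames):
        scores = delta[:, None] + log_trans  # scores[i, j]: state i -> state j
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + log_emis[t]
    path = [int(delta.argmax())]
    for t in range(n_frames - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]


# Toy example: per-frame posteriors as an MLP might output them (made up).
posteriors = np.array([[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.1, 0.9]])
priors = np.array([0.5, 0.5])
log_trans = np.log(np.array([[0.7, 0.3], [0.0, 1.0]]) + EPS)  # left-to-right HMM
log_init = np.log(np.array([1.0, 0.0]) + EPS)

path = viterbi(scaled_log_likelihoods(posteriors, priors), log_trans, log_init)
print(path)  # [0, 0, 1, 1]
```

The MLP is trained discriminatively to separate states, whereas the HMM contributes only the temporal structure through its transition probabilities.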