This paper presents a method of tone recognition for Mandarin speech by using combination of wavelet transform and hidden Markov modeling techniques. A pitch detector based on singularity detection and multi-resolutio...This paper presents a method of tone recognition for Mandarin speech by using combination of wavelet transform and hidden Markov modeling techniques. A pitch detector based on singularity detection and multi-resolution analysis of wavelet transform is employed for estimation of pitch periods, and hidden Markov modeling with partition Gaussian mixtures probability density function is used for the tone recognition. The algorithm can provide recognition accuracy of 97.22% and 94.47% for speaker-dependent and speaker-independent tone recognition, respectively.展开更多
Emotion mismatch between training and testing will cause system performance decline sharply which is emotional speaker recognition. It is an important idea to solve this problem according to the emotion normalization ...Emotion mismatch between training and testing will cause system performance decline sharply which is emotional speaker recognition. It is an important idea to solve this problem according to the emotion normalization of test speech. This method proceeds from analysis of the differences between every kind of emotional speech and neutral speech. Besides, it takes the baseband mismatch of emotional changes as the main line. At the same time, it gives the corresponding algorithm according to four technical points which are emotional expansion, emotional shield, emotional normalization and score compensation. Compared with the traditional GMM-UBM method, the recognition rate in MASC corpus and EPST corpus was increased by 3.80% and 8.81% respectively.展开更多
基金Supported by the National Natural Science Foundatiuon of China
文摘This paper presents a method of tone recognition for Mandarin speech by using combination of wavelet transform and hidden Markov modeling techniques. A pitch detector based on singularity detection and multi-resolution analysis of wavelet transform is employed for estimation of pitch periods, and hidden Markov modeling with partition Gaussian mixtures probability density function is used for the tone recognition. The algorithm can provide recognition accuracy of 97.22% and 94.47% for speaker-dependent and speaker-independent tone recognition, respectively.
文摘Emotion mismatch between training and testing will cause system performance decline sharply which is emotional speaker recognition. It is an important idea to solve this problem according to the emotion normalization of test speech. This method proceeds from analysis of the differences between every kind of emotional speech and neutral speech. Besides, it takes the baseband mismatch of emotional changes as the main line. At the same time, it gives the corresponding algorithm according to four technical points which are emotional expansion, emotional shield, emotional normalization and score compensation. Compared with the traditional GMM-UBM method, the recognition rate in MASC corpus and EPST corpus was increased by 3.80% and 8.81% respectively.