Abstract: Mandarin Chinese has four tone patterns: (1) high level; (2) rising; (3) low falling and rising; and (4) high falling. The present study examines the efficacy of an artificial neural network in recognizing these tone patterns. Speech data were recorded from 12 children (3-6 years of age) and 15 adults. All subjects were native Mandarin Chinese speakers. The fundamental frequency (F0) of each monosyllabic word in the speech data was extracted with an autocorrelation method. The pitch data (i.e., the F0 contours) were the inputs to a feed-forward backpropagation artificial neural network. The number of inputs to the neural network varied from 1 to 16, and the number of neurons in the hidden layer also varied from 1 to 16. The output layer consisted of four neurons representing the four tone patterns of Mandarin Chinese. After being trained with Levenberg-Marquardt optimization, the neural network classified the tone patterns with an accuracy of about 90% for speech samples from both adults and children. The artificial neural network may provide an objective and effective way of assessing tone production in prelingually deafened children who have received cochlear implants.
Funding: This project is supported by the National Natural Science Foundation of China.
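The input and output encoding described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how an F0 contour of arbitrary length was reduced to between 1 and 16 network inputs, so linear interpolation is an assumption here, and the function names `resample_contour`, `one_hot_tone`, and `predicted_tone` are hypothetical.

```python
def resample_contour(f0, n_points):
    """Linearly interpolate an F0 contour (Hz) to exactly n_points samples.

    Assumption: the abstract does not state the resampling method.
    """
    if not f0:
        raise ValueError("empty F0 contour")
    if n_points < 1:
        raise ValueError("n_points must be >= 1")
    if n_points == 1 or len(f0) == 1:
        # Degenerate cases: represent the contour by its mean value.
        mean = sum(f0) / len(f0)
        return [mean] * n_points
    step = (len(f0) - 1) / (n_points - 1)
    out = []
    for i in range(n_points):
        pos = i * step
        lo = int(pos)
        hi = min(lo + 1, len(f0) - 1)
        frac = pos - lo
        out.append(f0[lo] * (1.0 - frac) + f0[hi] * frac)
    return out


def one_hot_tone(tone):
    """Encode tone 1-4 as a training target for four output neurons."""
    if tone not in (1, 2, 3, 4):
        raise ValueError("tone must be 1, 2, 3, or 4")
    return [1.0 if k == tone else 0.0 for k in (1, 2, 3, 4)]


def predicted_tone(outputs):
    """Pick the tone whose output neuron has the largest activation."""
    return max(range(4), key=lambda k: outputs[k]) + 1
```

A contour resampled this way gives the fixed-length input vector, and `predicted_tone` maps the four output activations back to a tone label.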
Abstract: This paper presents a reliable speaker-independent method of recognizing Chinese tones. An unbiased center-clipping autocorrelation algorithm for pitch period extraction is proposed. A two-dimensional decision vector is used for recognizing Chinese tones; it is obtained by passing the pitch period sequence through data selection, error correction, data smoothing, and curve fitting. The average correct rate of tone recognition for isolated Chinese syllables is over 98%.
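A rough sketch of pitch period extraction by center-clipped autocorrelation is given below. The abstract does not define what "unbiased" refers to or give the clipping level, so two assumptions are made: the clipping threshold is a fixed fraction of the frame peak (the `ratio` parameter, hypothetical), and "unbiased" is read as the unbiased autocorrelation estimator, which divides by the number of overlapping samples at each lag. The paper's actual algorithm may differ.

```python
import math


def center_clip(frame, ratio=0.3):
    """Zero samples within +/- ratio * peak and shift the rest toward zero."""
    peak = max(abs(s) for s in frame)
    threshold = ratio * peak
    clipped = []
    for s in frame:
        if s > threshold:
            clipped.append(s - threshold)
        elif s < -threshold:
            clipped.append(s + threshold)
        else:
            clipped.append(0.0)
    return clipped


def pitch_period(frame, min_lag, max_lag, ratio=0.3):
    """Return the lag (in samples) with the largest autocorrelation value.

    Returns None when no lag has a positive value (for example, silence).
    """
    x = center_clip(frame, ratio)
    n = len(x)
    best_lag, best_val = None, 0.0
    for lag in range(min_lag, min(max_lag, n - 1) + 1):
        acc = sum(x[i] * x[i + lag] for i in range(n - lag))
        val = acc / (n - lag)  # divide by the count of overlapping terms
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag


# Usage: a synthetic 100 Hz sine sampled at 8000 Hz has a period of 80 samples.
frame = [math.sin(2 * math.pi * 100 * i / 8000) for i in range(400)]
period = pitch_period(frame, min_lag=20, max_lag=120)
```

The lag search range should exclude multiples of the expected period, since a periodic signal correlates equally well at twice its period.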
Abstract: This paper describes a new pattern recognition approach to tone classification for Putonghua, which is important for Putonghua speech recognition. In this method, four parameters of the fundamental frequency trajectory are selected on the basis of a large number of statistical experiments. The four parameters are assumed to follow a multidimensional Gaussian distribution, and a non-Euclidean distance function for each tone class is derived according to the rule of minimum probability of classification error, so the decision results are optimal in a statistical sense. Experiments on speaker-independent tone classification of Putonghua show that the method gives very satisfactory results.
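The abstract does not give the distance function itself. Under the stated Gaussian assumption, the standard textbook form of the minimum-error (Bayes) rule leads to a distance of the following kind; the symbols are assumptions introduced here, with x the four-parameter feature vector, and μ_k, Σ_k, and P(ω_k) the mean, covariance matrix, and prior probability of tone class k.

```latex
% Log of the class-conditional Gaussian density times the prior (d = 4 parameters):
\ln\bigl[p(\mathbf{x}\mid\omega_k)\,P(\omega_k)\bigr]
  = -\tfrac{1}{2}(\mathbf{x}-\boldsymbol{\mu}_k)^{\top}\boldsymbol{\Sigma}_k^{-1}(\mathbf{x}-\boldsymbol{\mu}_k)
    -\tfrac{1}{2}\ln\lvert\boldsymbol{\Sigma}_k\rvert
    -\tfrac{d}{2}\ln 2\pi
    +\ln P(\omega_k)

% Multiplying by -2 and dropping the class-independent constant d \ln 2\pi:
d_k(\mathbf{x})
  = (\mathbf{x}-\boldsymbol{\mu}_k)^{\top}\boldsymbol{\Sigma}_k^{-1}(\mathbf{x}-\boldsymbol{\mu}_k)
    +\ln\lvert\boldsymbol{\Sigma}_k\rvert
    -2\ln P(\omega_k)

% Decision rule: assign x to the tone class with the smallest distance.
\hat{k} = \arg\min_{k\in\{1,2,3,4\}} d_k(\mathbf{x})
```

The first term is a Mahalanobis distance, which is non-Euclidean whenever Σ_k is not a multiple of the identity; whether the paper's function includes the determinant and prior terms is not stated in the abstract.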
Abstract: To utilize the supra-segmental nature of Mandarin tones, this article proposes a feature extraction method for hidden Markov model (HMM) based tone modeling. The method uses linear transforms to project the F0 (fundamental frequency) features of neighboring syllables as compensations, and adds them to the original F0 features of the current syllable. The transforms are discriminatively trained using an objective function termed "minimum tone error", which is a smooth approximation of tone recognition accuracy. Experiments show that the new tonal features achieve a 3.82% improvement in tone recognition rate over the baseline, which uses a maximum likelihood trained HMM on the normal F0 features. Further experiments show that discriminative HMM training on the new features is 8.78% better than the baseline.
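The compensation step described above can be sketched as below. This only illustrates how projected neighbor features are added to the current syllable's features; the abstract does not specify the feature dimensions or whether one or several transforms are used, so the two matrices `a_prev` and `a_next` are hypothetical placeholders, and the discriminative "minimum tone error" training that would estimate them is not shown.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def compensate(features, a_prev, a_next):
    """Add linearly projected neighbor features to each syllable's features.

    features: one F0 feature vector per syllable, in utterance order.
    a_prev, a_next: hypothetical transforms for the previous and next syllable.
    Syllables at the utterance edges only receive the compensation that exists.
    """
    out = []
    for t, cur in enumerate(features):
        new = list(cur)
        if t > 0:
            delta = matvec(a_prev, features[t - 1])
            new = [c + d for c, d in zip(new, delta)]
        if t + 1 < len(features):
            delta = matvec(a_next, features[t + 1])
            new = [c + d for c, d in zip(new, delta)]
        out.append(new)
    return out


# Usage with made-up two-dimensional features for three syllables.
feats = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
a_prev = [[0.5, 0.0], [0.0, 0.5]]
a_next = [[0.0, 1.0], [1.0, 0.0]]
new_feats = compensate(feats, a_prev, a_next)
```

Setting both matrices to zero recovers the original F0 features, which corresponds to the baseline in the abstract.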