Nonlinear Time-Frequency Distributions of Spectrum Energy Operator in Large Vocabulary Mandarin Speaker Independent Speech Recognition System (Cited by: 1)
1
Authors: Fadhil H. T. Al-dulaimy, Wang Zuoying (王作英). Tsinghua Science and Technology (SCIE, EI, CAS), 2003, No. 6, pp. 667-671 (5 pages)
This work demonstrates the use of the nonlinear time-frequency distribution (NLTFD) of a discrete time energy operator (DTEO), based on amplitude modulation-frequency modulation demodulation techniques, as a feature in speech recognition. The duration distribution based hidden Markov model in a speaker-independent large vocabulary Mandarin speech recognition system was reconstructed from the feature vectors in the front-end detection stage. The goal was to improve the performance of the existing system by combining new features with the baseline feature vector. This paper also deals with errors associated with using a pre-emphasis filter in the front-end processing of the present scheme, which increases the noise energy at high frequencies above 4 kHz and in some cases degrades the recognition accuracy. The experimental results show that eliminating the pre-emphasis filter from the pre-processing stage and using NLTFD with compensated DTEO, combined with Mel-frequency cepstrum coefficients, gives a 21.95% reduction in the relative error rate compared to the conventional technique with 25 candidates used in the test.
Keywords: large vocabulary speech recognition; duration distribution based hidden Markov model; robust feature; energy operator
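The abstract above does not define its discrete time energy operator. As a rough illustration only, the sketch below assumes the standard Teager-Kaiser form, psi[n] = x[n]^2 - x[n-1]*x[n+1], which is the operator commonly used in AM-FM demodulation; the paper's "compensated DTEO" and its NLTFD feature are not reproduced here. The function name `teager_energy` and the test signal are hypothetical.

```python
import math

def teager_energy(x):
    """Teager-Kaiser discrete-time energy operator (assumed form, not taken
    from the paper): psi[n] = x[n]^2 - x[n-1] * x[n+1].
    Returns values for the interior samples n = 1 .. len(x) - 2."""
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

# Hypothetical test signal: a pure cosine with amplitude A and
# digital frequency omega (radians per sample).
amplitude, omega = 2.0, 0.3
signal = [amplitude * math.cos(omega * n) for n in range(50)]

psi = teager_energy(signal)

# For a pure sinusoid the operator is constant: A^2 * sin^2(omega).
expected = amplitude ** 2 * math.sin(omega) ** 2
```

For a pure sinusoid the output depends on both amplitude and frequency, which is why this operator is a common starting point for separating the amplitude and frequency modulation of a signal.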
Adaptive Compensation Algorithm in Open Vocabulary Mandarin Speaker-Independent Speech Recognition
2
Authors: Fadhil H. T. Al-dulaimy, Wang Zuoying (王作英), Tian Ye (田野). Tsinghua Science and Technology (SCIE, EI, CAS), 2002, No. 5, pp. 521-526 (6 pages)
In speech recognition systems, the physiological characteristics of the speech production model cause the voiced sections of the speech signal to have an attenuation of approximately 20 dB per decade. Many speech recognition algorithms compensate for this by filtering the input signal with a single-zero high-pass filter. Unfortunately, this technique increases the noise energy at high frequencies above 4 kHz, which in some cases degrades the recognition accuracy. This paper addresses the problem of using a pre-emphasis filter in the front end of the recognizer. The aim is to develop a modified parameterization approach that takes into account the whole energy zone in the spectrum, to improve the performance of the existing baseline recognition system in the acoustic phase. The results show that a large vocabulary speaker-independent continuous speech recognition system using this approach has a greatly improved recognition rate.
Keywords: mel-frequency cepstrum coefficients; speech recognition; duration distribution based hidden Markov model
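The single-zero high-pass (pre-emphasis) filter mentioned in the abstract above is usually written as y[n] = x[n] - a*x[n-1]. The sketch below is a minimal illustration of why such a filter raises high-frequency noise: its gain grows from 1 - a at zero frequency to 1 + a at the Nyquist frequency. The coefficient a = 0.97 is an assumed, commonly used value and is not stated in the abstract; the function names are hypothetical.

```python
import cmath
import math

def pre_emphasis(x, a=0.97):
    """Single-zero high-pass filter y[n] = x[n] - a * x[n-1].
    The first sample is passed through unchanged.
    The default a = 0.97 is an assumption, not a value from the paper."""
    if not x:
        return []
    return [x[0]] + [x[n] - a * x[n - 1] for n in range(1, len(x))]

def gain(omega, a=0.97):
    """Magnitude response |1 - a * exp(-j * omega)| at digital
    frequency omega (radians per sample, 0 .. pi)."""
    return abs(1 - a * cmath.exp(-1j * omega))

low_gain = gain(0.0)        # gain at zero frequency: 1 - a
high_gain = gain(math.pi)   # gain at the Nyquist frequency: 1 + a

# A constant (zero-frequency) input is almost entirely suppressed
# after the first sample.
flat = pre_emphasis([1.0, 1.0, 1.0])
```

Because the gain near the Nyquist frequency is far larger than the gain near zero frequency, any broadband noise in the upper band is amplified along with the speech, which is the side effect both abstracts describe.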