Funding: Supported by the National Natural Science Foundation of China under Grants No. 61005004, No. 61175011, and No. 61171193; the Next-Generation Broadband Wireless Mobile Communications Network Technology Key Project under Grant No. 2011ZX03002-005-01; the One Church, One Family, One Purpose (111 Project) under Grant No. B08004; the Key Project of Ministry of Science and Technology of China under Grant No. 2012ZX03002019-002; and the National High Technical Research and Development Program of China (863 Program) under Grant No. 2011AA01A205.
Abstract: Including more potentially correct words in the candidate sets is important for improving the accuracy of Large Vocabulary Continuous Speech Recognition (LVCSR). A candidate expansion algorithm based on the Weighted Syllable Confusion Matrix (WSCM) is proposed. First, the WSCM is derived from a confusion network. Then, the recognised candidates in the confusion network are used to conjecture the most likely correct words based on the WSCM, after which the conjectured words are combined with the recognised candidates to produce an expanded candidate set. Finally, a model combining mutual information and a trigram language model is used to rerank the candidates. Experiments on Mandarin film data show an improvement of 9.57% in the character correction rate over the initial recognition performance on lightly erroneous utterances.
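The expand-then-rerank pipeline described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the representation of words as tuples of syllables, the names `expand_candidates` and `rerank`, the `top_k` cutoff, and the linear interpolation weight `lam` are all assumptions made for illustration, and the language-model and mutual-information scores are supplied by the caller rather than computed here.

```python
from itertools import product

def expand_candidates(recognised, confusion, lexicon, top_k=2):
    """Add lexicon words reachable by swapping syllables for confusable ones.

    recognised: {word: score}, each word a tuple of syllables (assumed format).
    confusion:  {syllable: {confusable_syllable: weight}}, a stand-in for WSCM.
    lexicon:    set of valid words, each a tuple of syllables.
    """
    expanded = {}
    for word, score in recognised.items():
        expanded[word] = max(expanded.get(word, 0.0), score)
        # For each syllable keep itself (weight 1.0) plus its top_k neighbours.
        options = []
        for syl in word:
            alts = sorted(confusion.get(syl, {}).items(),
                          key=lambda kv: kv[1], reverse=True)[:top_k]
            options.append([(syl, 1.0)] + alts)
        for combo in product(*options):
            cand = tuple(s for s, _ in combo)
            if cand not in lexicon:
                continue
            weight = score
            for _, w in combo:
                weight *= w
            expanded[cand] = max(expanded.get(cand, 0.0), weight)
    return expanded

def rerank(candidates, lm_score, mi_score, lam=0.5):
    """Order candidates by an assumed linear mix of two caller-supplied scores."""
    scored = {c: lam * lm_score(c) + (1.0 - lam) * mi_score(c) for c in candidates}
    return sorted(scored, key=scored.get, reverse=True)
```

In this sketch a conjectured word inherits the score of the recognised word it was derived from, scaled by the confusion weights of the swapped syllables; the abstract does not state how the original system weights conjectured words, so this choice is only one plausible option.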
Funding: Project (60763001) supported by the National Natural Science Foundation of China; Projects (2009GZS0027, 2010GZS0072) supported by the Natural Science Foundation of Jiangxi Province, China.
Abstract: To overcome the defects of the classical hidden Markov model (HMM), a new statistical model, the Markov family model (MFM), was proposed. The Markov family model was applied to speech recognition and natural language processing. Speaker-independent continuous speech recognition experiments and part-of-speech tagging experiments show that the Markov family model outperforms the hidden Markov model. In the part-of-speech tagging experiments, precision improves from 94.642% to 96.214%, and in the speech recognition experiments the word error rate is reduced by 11.9% relative to the HMM baseline system.
Funding: Project 60475007 supported by the National Natural Science Foundation of China.
Abstract: The design of acoustic models is vital to building a reliable connection between the acoustic waveform and linguistic messages in terms of individual speech units. Based on the characteristics of Chinese phonemes, the base set of acoustic phoneme units is determined and refined, and a decision-tree-based state-tying approach is explored. Because one advantage of the top-down tying method is its flexibility in balancing model accuracy against complexity, relevant parameters are adjusted, such as the stopping criterion for decision tree node splitting, and optimal thresholds are identified in the process. Acoustic modeling accuracy is improved while the size of the model is kept small enough to remain trainable.
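The role of the stopping criterion in top-down state tying can be sketched as follows. This is an illustrative simplification, not the method of the paper: it assumes each state is summarised by a one-dimensional Gaussian with an occupancy count, and the names `pooled_loglik`, `tie_states`, `min_gain`, and `min_occ` are hypothetical. A node is split on the phonetic question with the largest log-likelihood gain, and splitting stops when that gain falls below `min_gain` or a child would have less than `min_occ` occupancy.

```python
import math

def pooled_loglik(states):
    """Approximate log-likelihood of modelling all states with one Gaussian."""
    n = sum(s["occ"] for s in states)
    mean = sum(s["occ"] * s["mean"] for s in states) / n
    second = sum(s["occ"] * (s["var"] + s["mean"] ** 2) for s in states) / n
    var = second - mean ** 2
    return -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)

def partition(states, pred):
    """Split states by whether their context answers the question with yes."""
    yes = [s for s in states if pred(s["ctx"])]
    no = [s for s in states if not pred(s["ctx"])]
    return yes, no

def tie_states(states, questions, min_gain=1.0, min_occ=10.0):
    """Recursively split a node; each returned list is one tied-state cluster."""
    base = pooled_loglik(states)
    best_name, best_gain = None, min_gain
    for name, pred in questions.items():
        yes, no = partition(states, pred)
        if not yes or not no:
            continue  # question does not separate this node
        if min(sum(s["occ"] for s in yes), sum(s["occ"] for s in no)) < min_occ:
            continue  # a child would have too little training data
        gain = pooled_loglik(yes) + pooled_loglik(no) - base
        if gain >= best_gain:
            best_name, best_gain = name, gain
    if best_name is None:
        return [states]  # stopping criterion met: this node becomes a leaf
    yes, no = partition(states, questions[best_name])
    return (tie_states(yes, questions, min_gain, min_occ)
            + tie_states(no, questions, min_gain, min_occ))
```

Raising `min_gain` yields fewer, larger clusters (a smaller model), while lowering it yields more clusters (a more detailed model), which is the accuracy versus complexity trade-off the abstract refers to.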