Abstract: This paper describes a method for recognizing Chinese tones in continuous speech. The first- and second-order differentials of the logarithm of the fundamental frequency are used as feature parameters. Each Chinese tone is represented by a five-state left-to-right hidden Markov model in which each state is modeled by a single Gaussian distribution. Unvoiced portions are coded with normally distributed random values so that all time frames in an utterance can be handled uniformly. Speaker-dependent tone recognition was conducted for ten speakers, yielding an average recognition rate of 81.8%.
Funding: Supported by the National Natural Science Foundation of China (No. 60172048)
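The feature extraction and model topology described in the tone recognition abstract above can be illustrated with a minimal Python sketch. It assumes F0 is given per frame in Hz with 0 marking an unvoiced frame, that the random values are substituted on the log-F0 track before differencing, and that simple adjacent-frame differences stand in for the differentials. The function names, the noise mean and standard deviation, and the 0.5 self-transition probability are hypothetical choices for illustration, not values taken from the paper.

```python
import math
import random


def tone_features(f0_hz, seed=0, noise_mean=0.0, noise_std=1.0):
    """Return one (delta, delta-delta) pair of log-F0 per usable frame.

    f0_hz: per-frame F0 in Hz; 0 marks an unvoiced frame (assumption).
    Unvoiced frames get normally distributed random values so that every
    frame of the utterance can be processed the same way.
    """
    rng = random.Random(seed)
    log_f0 = [
        math.log(f) if f > 0 else rng.gauss(noise_mean, noise_std)
        for f in f0_hz
    ]
    # First-order differential: difference between adjacent frames.
    delta = [b - a for a, b in zip(log_f0, log_f0[1:])]
    # Second-order differential: difference of the first-order values.
    delta2 = [b - a for a, b in zip(delta, delta[1:])]
    # Drop the first delta so both feature streams have equal length.
    return list(zip(delta[1:], delta2))


def left_to_right_transitions(n_states=5, stay=0.5):
    """Transition matrix of a left-to-right HMM without skips.

    Each state either stays put or moves to its right neighbour;
    the last state only loops on itself.
    """
    matrix = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states):
        if i + 1 < n_states:
            matrix[i][i] = stay
            matrix[i][i + 1] = 1.0 - stay
        else:
            matrix[i][i] = 1.0
    return matrix


if __name__ == "__main__":
    frames = [100.0, 110.0, 121.0, 0.0, 0.0, 130.0, 128.0]
    for d1, d2 in tone_features(frames):
        print(f"{d1:+.4f} {d2:+.4f}")
```

In a full recognizer each state of the matrix above would carry a single Gaussian over the two-dimensional feature, with one such model trained per tone.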
Abstract: As a statistical method, the hidden Markov model (HMM) is widely used for speech recognition. To train HMMs effectively with much less data, the Subspace Distribution Clustering Hidden Markov Model (SDCHMM), derived from the Continuous Density Hidden Markov Model (CDHMM), is introduced. A new method for training SDCHMMs that exploits parameter tying is described. Compared with the conventional training method, an SDCHMM recognizer trained with the new method achieves higher accuracy and speed. Experimental results show that the SDCHMM recognizer outperforms the CDHMM recognizer on Chinese digit recognition.
Abstract: The previously proposed syllable-synchronous network search (SSNS) algorithm plays an important role in word decoding for continuous Chinese speech recognition and achieves satisfactory performance. This paper studies several key factors that may affect overall word decoding performance, including refinement of the vocabulary, big-discount Turing re-estimation of the N-gram probabilities, and management of the search path buffers. Based on this analysis, corresponding improvements to the SSNS algorithm are proposed. Compared with the previous version of the SSNS algorithm, the new version reduces the Chinese character error rate (CCER) in word decoding by 42.1% on a database consisting of a large number of test sentences (syllable strings).
Abstract: After pointing out the unreasonableness of the three basic assumptions underlying the HMM, we introduce the theory and advantages of Stochastic Trajectory Models (STMs), which may resolve the problems caused by these assumptions. In an STM, the acoustic observations of an acoustic unit are represented as clusters of trajectories in a parameter space. The trajectories are modeled by a mixture of probability density functions of random sequences of states. After analyzing the characteristics of Chinese speech, the acoustic units for STM-based continuous Chinese speech recognition are discussed and phone-like units are suggested. The performance of STM-based continuous Chinese speech recognition is studied on the VINICS system. The experimental results demonstrate the efficiency of STMs and the consistency of phone-like units.
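Abstract: To extract initial values for continuous Markov models, a K-means clustering method is proposed in which the classes are distributed approximately evenly over the training sample space. A penalty factor is introduced into the clustering process to keep too many training vectors from concentrating in one or a few classes, so that the sample space is partitioned approximately uniformly. Experiments on initial value extraction for continuous Markov models show that, compared with standard K-means clustering and LBG (Linde-Buzo-Gray) clustering, the method reduces the global distortion produced by vector quantization, distributes the classes more evenly over the sample space, and improves vector quantization performance. When the method is used to extract initial values for continuous Markov models in isolated word recognition, the parameter estimates of the individual Gaussian probability density functions come closer to their unbiased estimates, which improves the reliability of the initial Markov model values.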
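The idea of a penalty factor that discourages overfull classes can be sketched in Python as follows. The abstract does not give the form of the penalty, so this sketch assumes one plausible choice: the squared distance to a center is inflated in proportion to how many vectors that class has already received in the current pass. The names `penalized_kmeans` and `penalty` are hypothetical, and setting `penalty=0` falls back to standard K-means assignment.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def penalized_kmeans(points, k, penalty=1.0, iters=20, seed=0):
    """K-means whose assignment cost grows with the class's current size.

    Assumed cost: sq_dist * (1 + penalty * count / (N / k)), so a class
    that already holds many vectors looks farther away than it is.
    """
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    target = len(points) / k  # class size under a perfectly even split
    for _ in range(iters):
        counts = [0] * k
        for i, p in enumerate(points):
            costs = [
                sq_dist(p, centers[j]) * (1.0 + penalty * counts[j] / target)
                for j in range(k)
            ]
            j = min(range(k), key=costs.__getitem__)
            labels[i] = j
            counts[j] += 1
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if the class is empty
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    return centers, labels


if __name__ == "__main__":
    data_rng = random.Random(1)
    data = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(200)]
    _, labels = penalized_kmeans(data, k=4)
    print([labels.count(j) for j in range(4)])
```

For HMM initialization, the vectors assigned to each class would then supply the initial mean and variance of one Gaussian component.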