Abstract
The linear hypothesis is the main disadvantage of maximum likelihood linear regression (MLLR). This paper applies polynomial regression to model adaptation and proposes a nonlinear model adaptation algorithm, maximum likelihood polynomial regression (MLPR), for robust speech recognition. In this algorithm, the nonlinear relationship between the training and testing Gaussian means in each Mel channel is approximated by a set of polynomials, and the polynomial coefficients are estimated from adaptation data collected in the test environment using the expectation-maximization (EM) algorithm under the maximum likelihood (ML) criterion. The experimental results show that a second-order polynomial approximates the actual nonlinear function more closely and that, in both noise compensation and speaker adaptation, the word error rates of MLPR are significantly lower than those of MLLR. The proposed MLPR algorithm overcomes the limitation of the linear hypothesis and can reduce the impact of noise, speaker variation, and other factors simultaneously. It is especially suitable for joint adaptation to speaker and noise.
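The coefficient estimation described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation, and it rests on several assumptions: a single Mel channel is treated as a scalar, each Gaussian has a scalar variance, and the component posteriors from the E-step are already available and held fixed. Under those assumptions, maximizing the EM auxiliary function with respect to the polynomial coefficients reduces to a weighted least-squares problem. The function names (`solve_linear`, `estimate_poly_coeffs`, `adapt_means`) and all argument names are hypothetical.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def estimate_poly_coeffs(means, variances, gammas, obs, order=2):
    """M-step sketch for one channel (assumed scalar, fixed posteriors).

    means[m], variances[m]: training mean and variance of Gaussian m
    gammas[t][m]: posterior of Gaussian m at adaptation frame t (from the E-step)
    obs[t]: adaptation observation at frame t in this channel
    Returns coefficients a_0..a_order of the mean-mapping polynomial.
    """
    K = order + 1
    A = [[0.0] * K for _ in range(K)]
    b = [0.0] * K
    for m, (mu, var) in enumerate(zip(means, variances)):
        occ = sum(g[m] for g in gammas)                     # occupancy count
        first = sum(g[m] * o for g, o in zip(gammas, obs))  # weighted data sum
        v = [mu ** k for k in range(K)]                     # [1, mu, mu^2, ...]
        for i in range(K):
            b[i] += first * v[i] / var
            for j in range(K):
                A[i][j] += occ * v[i] * v[j] / var
    return solve_linear(A, b)  # normal equations of the weighted fit


def adapt_means(means, coeffs):
    """Map each training mean through the estimated polynomial."""
    return [sum(a * mu ** k for k, a in enumerate(coeffs)) for mu in means]
```

In this sketch, setting `order=1` gives a per-channel linear mapping (a scale and a bias), so the second-order case differs only by the extra quadratic term in each channel.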
Funding
Supported by the 973 Program of China (2002CB312102) and the National Natural Science Foundation of China (60672094).