Based on W-disjoint orthogonality of speech mixtures, a space d,scnmlnative tunetlon was proposer1 to enumerate and localize competing speakers in the surrounding environments. Then, a Wiener-like postfiherer was deve...Based on W-disjoint orthogonality of speech mixtures, a space d,scnmlnative tunetlon was proposer1 to enumerate and localize competing speakers in the surrounding environments. Then, a Wiener-like postfiherer was developed to adaptively suppress interferences. Experimental results with a hands-free speech recognizer under various SNR and competing speakers settings show that nearly 69 % error reduction can be obtained with a two-channel small aperture microphone array against the conventional single microphone baseline system. Comparisons were made against traditional delay-and-sum and Griffiths-Jim adaptive beamforming techniques to further assess the effectiveness of this method.展开更多
The performance of automatic speech recognizer degrades seriously when there are mismatches between the training and testing conditions. Vector Taylor Series (VTS) approach has been used to compensate mismatches cau...The performance of automatic speech recognizer degrades seriously when there are mismatches between the training and testing conditions. Vector Taylor Series (VTS) approach has been used to compensate mismatches caused by additive noise and convolutive channel distortion in the cepstral domain, in this paper, the conventional VTS is extended by incorporating noise clustering into its EM iteration procedure, improving its compensation effectiveness under non-stationary noisy environments. Recognition experiments under babble and exhibition noisy environments demonstrate that the new algorithm achieves 35% average error rate reduction compared with the conventional VTS.展开更多
文摘Based on W-disjoint orthogonality of speech mixtures, a space d,scnmlnative tunetlon was proposer1 to enumerate and localize competing speakers in the surrounding environments. Then, a Wiener-like postfiherer was developed to adaptively suppress interferences. Experimental results with a hands-free speech recognizer under various SNR and competing speakers settings show that nearly 69 % error reduction can be obtained with a two-channel small aperture microphone array against the conventional single microphone baseline system. Comparisons were made against traditional delay-and-sum and Griffiths-Jim adaptive beamforming techniques to further assess the effectiveness of this method.
文摘The performance of automatic speech recognizer degrades seriously when there are mismatches between the training and testing conditions. Vector Taylor Series (VTS) approach has been used to compensate mismatches caused by additive noise and convolutive channel distortion in the cepstral domain, in this paper, the conventional VTS is extended by incorporating noise clustering into its EM iteration procedure, improving its compensation effectiveness under non-stationary noisy environments. Recognition experiments under babble and exhibition noisy environments demonstrate that the new algorithm achieves 35% average error rate reduction compared with the conventional VTS.