Abstract
This paper studies frame erasure concealment, which addresses the rapid drop in speech quality caused by the loss of speech parameters during transmission. A large-scale hidden Markov model (HMM) is used to model the immittance spectral frequency (ISF) parameters of the adaptive multi-rate wideband (AMR-WB) codec, and the lost ISFs are estimated optimally under the minimum mean square error (MMSE) criterion. Each estimated ISF vector is then weighted with the ISF vector of the preceding frame to smooth the estimate, yielding the concealed ISF vector. At the receiver, the concealed ISFs replace the lost ISFs in speech synthesis. Speech concealed by the proposed algorithm is compared with speech concealed by the baseline method in Annex I of the G.722.2 specification. Simulation results show that the proposed algorithm outperforms the baseline on both evaluation measures: the signal-to-noise ratio (SNR) increases by about 2.41 dB and the frequency-weighted spectral distortion decreases by about 0.885 dB, which supports the effectiveness of the algorithm.
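The estimate-then-smooth step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the model size, the state means, the transition probabilities, or the smoothing weight, so the two-state toy model, the function names, and the `alpha` parameter below are all assumptions made for illustration. The sketch predicts the HMM state distribution for the lost frame from the previous frame's state distribution, takes the probability-weighted mean of the per-state ISF mean vectors as the MMSE estimate, and then blends that estimate with the previous frame's ISF vector.

```python
def predict_state_probs(prev_probs, trans):
    """One-step HMM prediction: P(s_t = j) = sum_i P(s_{t-1} = i) * a_ij."""
    n = len(trans)
    return [sum(prev_probs[i] * trans[i][j] for i in range(n)) for j in range(n)]


def mmse_estimate(state_probs, state_means):
    """MMSE estimate: probability-weighted average of the per-state mean vectors."""
    dim = len(state_means[0])
    return [
        sum(p * mean[k] for p, mean in zip(state_probs, state_means))
        for k in range(dim)
    ]


def conceal_isf(prev_isf, prev_probs, trans, state_means, alpha=0.5):
    """Estimate a lost ISF vector and smooth it with the previous frame.

    alpha is a hypothetical smoothing weight (assumption, not from the paper):
    1.0 uses only the HMM estimate, 0.0 simply repeats the previous frame.
    """
    probs = predict_state_probs(prev_probs, trans)
    estimate = mmse_estimate(probs, state_means)
    concealed = [alpha * e + (1.0 - alpha) * p for e, p in zip(estimate, prev_isf)]
    return concealed, probs


# Toy two-state model with 2-dimensional "ISF" vectors (illustrative values only).
trans = [[0.9, 0.1], [0.2, 0.8]]
state_means = [[100.0, 200.0], [300.0, 400.0]]
prev_probs = [1.0, 0.0]      # previous frame assumed to be in state 0
prev_isf = [110.0, 210.0]    # last correctly received ISF vector

concealed, probs = conceal_isf(prev_isf, prev_probs, trans, state_means)
print(concealed, probs)
```

When several consecutive frames are lost, the returned state distribution can be fed back in as `prev_probs` and the concealed vector as `prev_isf`, so the estimate drifts toward the model's long-run average instead of freezing on the last good frame.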
Funding
The Science Foundation of Southeast University (No. XJ0704268)
The Natural Science Foundation of the Education Department of Anhui Province (No. KJ2007B088)