Abstract
To address the changes in the acoustic characteristics of speech caused by vocal effort variation, a method based on acoustic model adaptation is proposed for Mandarin speech recognition. The performance of acoustic models trained on normal-mode speech is first analyzed when they are used to recognize speech in the other vocal effort modes. Based on the properties of the stochastic segment model, the maximum likelihood linear regression (MLLR) adaptation method is then introduced into the stochastic segment model system, and the adapted acoustic models are used to recognize speech in the corresponding vocal effort mode. Mandarin continuous speech recognition experiments on the "863-test" test set show that recognition accuracy drops substantially when models trained on normal speech are used to recognize speech in the other four vocal effort modes, and that accuracy improves markedly when the adapted system recognizes speech in the corresponding mode. These results indicate that acoustic model adaptation is effective in handling vocal effort variability in speech recognition.
Source
《计算机工程与应用》
CSCD
Peking University Core Journals (北大核心)
2016, No. 2, pp. 156-160, 204 (6 pages)
Computer Engineering and Applications
Funding
National Natural Science Foundation of China (No. 61175066, No. 61300124)
Henan Province Basic and Frontier Technology Research Program (No. 132300410332)
Keywords
speech recognition
vocal effort
adaptation
maximum likelihood linear regression