Abstract
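In Mongolian, the uncertain position of short vowels in non-initial syllables gives rise to multiple pronunciations of a single word, sound changes in word formation, coarticulation, and connected-speech effects in spoken language, all of which make acoustic models adapt poorly. Using a small adaptation dataset and combining the MLLR and MAP modeling methods, this work studies the adaptability of a baseline Mongolian acoustic model from two angles, the choice of the τ value and the training procedure for the adapted acoustic model, and presents an MLLR-MAP method suited to building adaptive acoustic models for Mongolian speech recognition. Modeling experiments were carried out on the Sphinx speech recognition platform, and the MAP, MLLR, MAP-MLLR, and MLLR-MAP methods were evaluated using acoustic model recognition rate and system recognition rate as metrics. The results show improvements on all three acoustic model metrics (total correct rate, error rate, and accuracy), clearly outperforming the baseline model.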
To address the poor adaptability of acoustic models caused by multiple pronunciations of a word, sound changes in word formation, coarticulation, and connected speech, all of which stem from the uncertain position of short vowels in non-initial syllables in Mongolian, this paper studies the adaptability of a baseline Mongolian acoustic model using a small adaptation dataset together with the MLLR and MAP methods. It examines both the selection of the τ value and the training procedure for the adapted acoustic model, and proposes an MLLR-MAP method suited to building adaptive acoustic models for Mongolian speech recognition. Modeling experiments were conducted on the Sphinx speech recognition platform, and the MAP, MLLR, MAP-MLLR, and MLLR-MAP methods were evaluated using acoustic model recognition rate and system recognition rate. The experimental results show that the Mongolian acoustic model built with the MLLR-MAP method improved in average total accuracy rate and average error rate, clearly outperforming the baseline model.
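The role of τ in MAP adaptation can be illustrated with a minimal sketch. It is not the paper's implementation: it uses the textbook MAP mean update, assumes that "MLLR-MAP" means applying an MLLR affine transform first and then using the transformed mean as the MAP prior, and takes the transform as given rather than estimating it. The function names and all numeric values are hypothetical.

```python
def mllr_transform_mean(mean, A, b):
    """Apply a given MLLR-style affine transform: mu' = A * mu + b."""
    return [
        sum(A[i][j] * mean[j] for j in range(len(mean))) + b[i]
        for i in range(len(b))
    ]


def map_adapt_mean(prior_mean, frames, posteriors, tau):
    """Textbook MAP update of a single Gaussian mean.

    prior_mean: mean of the unadapted (or MLLR-transformed) Gaussian
    frames:     adaptation feature vectors
    posteriors: occupation probability of this Gaussian for each frame
    tau:        prior weight; larger values keep the result closer to the prior
    """
    occupancy = sum(posteriors)
    dim = len(prior_mean)
    weighted_sum = [
        sum(g * x[d] for g, x in zip(posteriors, frames)) for d in range(dim)
    ]
    return [
        (tau * prior_mean[d] + weighted_sum[d]) / (tau + occupancy)
        for d in range(dim)
    ]


def mllr_then_map(mean, A, b, frames, posteriors, tau):
    """Assumed MLLR-MAP ordering: transform first, then MAP with that prior."""
    transformed = mllr_transform_mean(mean, A, b)
    return map_adapt_mean(transformed, frames, posteriors, tau)


if __name__ == "__main__":
    # Hypothetical two-dimensional example with two adaptation frames.
    base_mean = [0.0, 0.0]
    identity = [[1.0, 0.0], [0.0, 1.0]]
    shift = [0.5, -0.5]
    data = [[2.0, 4.0], [2.0, 4.0]]
    gammas = [1.0, 1.0]
    for tau in (0.0, 2.0, 100.0):
        print(tau, mllr_then_map(base_mean, identity, shift, data, gammas, tau))
```

With a small τ the adapted mean follows the adaptation data closely, while a large τ keeps it near the prior, which is why the choice of τ matters when the adaptation set is small.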
Source
《计算机应用与软件》
Peking University Core Journal (北大核心)
2018, No. 2, pp. 167-171, 234 (6 pages in total)
Computer Applications and Software
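Funding
National Natural Science Foundation of China (61650205)
Natural Science Foundation of Inner Mongolia Autonomous Region (2014MS0608)
Natural Science Foundation of Inner Mongolia University of Technology (ZD201118, ZS201127)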
Keywords
MLLR
MAP
Acoustic model
Adaptability
Mongolian
Speech recognition
Maximum likelihood linear regression
Maximum a posteriori
Acoustic model
Speaker adaptation
Mongolian speech recognition