Abstract
This paper studies continuous speech recognition for minority languages that lack natural-language corpora, taking Uyghur as an example. A seed acoustic model is first trained with HTK on a small, manually labeled corpus; the seed model then guides the construction of the final acoustic model from a larger speech corpus and its transcriptions. The statistical language model is built with palmkit, and continuous speech recognition is performed with the open-source recognizer Julius. In the experiments, 6 400 short sentences spoken freely by 64 native Uyghur speakers are used to train a monophone acoustic model, and 100 MB of text together with a 60 000-word dictionary are used to build a class-based 3-gram language model. Test results show a recognition rate of 72.5% in real-time recognition, 4.2 percentage points higher than the 68.3% obtained with HTK alone.
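As an illustration of the class-based 3-gram language model named in the abstract, the following minimal sketch uses the standard factorization P(w_i | w_{i-2}, w_{i-1}) ≈ P(c_i | c_{i-2}, c_{i-1}) · P(w_i | c_i), where c_i is the class of word w_i. It is not the paper's implementation and does not reproduce palmkit: the function names, the toy word-to-class map, and the placeholder tokens are all assumptions, and the estimates are unsmoothed maximum-likelihood counts with no back-off.

```python
from collections import Counter

def train_class_trigram(sentences, word2class):
    """Collect unsmoothed counts for a class-based trigram model (illustrative only)."""
    tri = Counter()   # counts of class trigrams (c1, c2, c3)
    hist = Counter()  # counts of class histories (c1, c2)
    emit = Counter()  # counts of (class, word) pairs
    cls = Counter()   # counts of each class
    for words in sentences:
        # Pad with two sentence-start symbols so every word has a two-class history.
        classes = ["<s>", "<s>"] + [word2class[w] for w in words]
        for i, w in enumerate(words):
            c1, c2, c3 = classes[i], classes[i + 1], classes[i + 2]
            tri[(c1, c2, c3)] += 1
            hist[(c1, c2)] += 1
            emit[(c3, w)] += 1
            cls[c3] += 1
    return tri, hist, emit, cls

def word_prob(word, history, model, word2class):
    """P(word | history) = P(class | class history) * P(word | class)."""
    tri, hist, emit, cls = model
    c1, c2 = history  # classes of the two preceding words
    c3 = word2class[word]
    if hist[(c1, c2)] == 0 or cls[c3] == 0:
        return 0.0  # unseen history or class; a real model would back off instead
    return (tri[(c1, c2, c3)] / hist[(c1, c2)]) * (emit[(c3, word)] / cls[c3])

# Hypothetical toy data: placeholder tokens, not Uyghur words.
word2class = {"a1": "A", "a2": "A", "b1": "B", "b2": "B"}
sentences = [["a1", "b1"], ["a2", "b1"], ["a1", "b2"]]
model = train_class_trigram(sentences, word2class)
```

Sharing statistics across a class lets words that are rare in the text still receive a usable probability. This is the usual motivation for class-based models when the vocabulary is large relative to the text.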
Source
《计算机工程》
CAS
CSCD
2012, No. 12, pp. 129-131, 135 (4 pages)
Computer Engineering
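Funding
General Program of the National Natural Science Foundation of China (2011211A012, 60863008)
Xinjiang Uyghur Autonomous Region Science and Technology Aid-Xinjiang Fund Project (201091106)
Doctoral Start-up Fund Project (BS090144)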