Abstract
In research on pronunciation quality evaluation, acoustic models are traditionally trained only on standard-pronunciation speech. Such models describe poorly the non-standard pronunciations encountered in real tests, so a mismatch between training and test conditions is unavoidable and automatic scoring performs poorly on accented speech. To address this problem, this paper proposes an algorithm that uses data covering all kinds of pronunciation, both standard and accented, and optimizes the acoustic model by minimizing the root mean square error between the machine scores and the manual scores. Experiments were conducted on 3,685 samples collected live at the Putonghua proficiency test (3,187 for training and 498 for testing). The results show that the evaluation-oriented acoustic models produced by the proposed optimization are significantly better than acoustic models built in the traditional way.
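The optimization criterion named in the abstract, reducing the (root) mean square error between machine and manual scores, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the one-feature linear scorer `w * x + b` stands in for the acoustic model, plain gradient descent stands in for the paper's optimization procedure, and the feature values and human scores are made-up toy numbers. None of this is taken from the paper. Because the square root is monotonic, minimizing the mean square error also minimizes the root mean square error.

```python
import math


def mse(machine, human):
    """Mean square error between machine scores and human scores."""
    return sum((m - h) ** 2 for m, h in zip(machine, human)) / len(human)


def rmse(machine, human):
    """Root mean square error; shares its minimizer with mse()."""
    return math.sqrt(mse(machine, human))


def fit_toy_scorer(features, human, lr=0.01, steps=2000):
    """Gradient descent on MSE for a hypothetical scorer: score = w * x + b.

    The linear scorer is only a stand-in for a real acoustic model.
    """
    w, b = 0.0, 0.0
    n = len(human)
    for _ in range(steps):
        errors = [w * x + b - h for x, h in zip(features, human)]
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, features)) / n
        grad_b = 2.0 * sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical toy data: one feature per utterance and a human score.
features = [0.0, 1.0, 2.0, 3.0, 4.0]
human = [1.0, 3.1, 4.9, 7.2, 8.8]

initial_rmse = rmse([0.0] * len(human), human)  # untrained scorer (w = b = 0)
w, b = fit_toy_scorer(features, human)
final_rmse = rmse([w * x + b for x in features], human)
```

Comparing `initial_rmse` with `final_rmse` shows the machine scores moving toward the human scores as the parameters are updated, which is the behavior the criterion is meant to produce.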
Source
《中文信息学报》
CSCD
Peking University Core Journal
2013, Issue 1, pp. 98-107 (10 pages)
Journal of Chinese Information Processing
Keywords
computer-assisted language learning
discriminative training
Putonghua Shuiping Ceshi (PSC)
pronunciation quality evaluation