Abstract
This paper proposes a new tone evaluation algorithm for continuous Mandarin speech. It is applicable to tone assessment in Computer Assisted Language Learning (CALL) systems and in the Mandarin proficiency test Putonghua Shuiping Ceshi (PSC). Because the tone of a syllable in continuous speech is influenced by its context, tri-syllable units are used to train Gaussian Mixture Models (GMMs) of tones. Within each tri-syllable unit, the pitch contour across the consonant (unvoiced) regions is fitted by Spline interpolation, so that the contour captures the fundamental frequency (F0) transition from the previous voiced region to the current one and from the current one to the next. The Fujisaki model is then used to remove sentence intonation and speaker-specific characteristics, so that only the lexical tone component of the F0 contour is modeled. Experimental results show that, compared with traditional features, tri-syllable Spline interpolation and the Fujisaki-based features improve the correlation between machine and human scores on the test set by 8.75% and 14.09%, respectively.
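As an illustration of the interpolation step described in the abstract, the following is a minimal sketch that fills unvoiced frames of an F0 track with a natural cubic spline fitted through the voiced frames. The abstract does not specify the spline variant, the frame representation, or any boundary handling, so these are assumptions: the function names `natural_cubic_spline` and `fill_unvoiced`, the convention that an F0 value of 0 marks an unvoiced frame, and the choice to leave leading and trailing unvoiced frames untouched are all hypothetical.

```python
import bisect


def natural_cubic_spline(xs, ys):
    """Return a callable evaluating the natural cubic spline through (xs, ys).

    xs must be strictly increasing and contain at least three points.
    """
    n = len(xs)
    if n < 3:
        raise ValueError("need at least three knots")
    h = [xs[i + 1] - xs[i] for i in range(n - 1)]

    # Tridiagonal system for the interior second derivatives m[1..n-2];
    # the natural boundary condition fixes m[0] = m[n-1] = 0.
    diag = [2.0 * (h[i - 1] + h[i]) for i in range(1, n - 1)]
    rhs = [
        6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
        for i in range(1, n - 1)
    ]

    # Forward elimination (Thomas algorithm).
    for k in range(1, len(diag)):
        w = h[k] / diag[k - 1]
        diag[k] -= w * h[k]
        rhs[k] -= w * rhs[k - 1]

    # Back substitution.
    m = [0.0] * n
    for k in range(len(diag) - 1, -1, -1):
        m[k + 1] = (rhs[k] - h[k + 1] * m[k + 2]) / diag[k]

    def evaluate(x):
        # Locate the interval [xs[i], xs[i+1]] that contains x.
        i = min(max(bisect.bisect_right(xs, x) - 1, 0), n - 2)
        a = xs[i + 1] - x
        b = x - xs[i]
        return (
            (m[i] * a ** 3 + m[i + 1] * b ** 3) / (6.0 * h[i])
            + (ys[i] / h[i] - m[i] * h[i] / 6.0) * a
            + (ys[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * b
        )

    return evaluate


def fill_unvoiced(f0):
    """Interpolate unvoiced frames (assumed to be marked by F0 <= 0).

    Only gaps between the first and last voiced frame are filled;
    leading and trailing unvoiced frames are left unchanged.
    """
    frames = [t for t, v in enumerate(f0) if v > 0]
    values = [f0[t] for t in frames]
    spline = natural_cubic_spline(frames, values)
    filled = list(f0)
    for t in range(frames[0], frames[-1] + 1):
        if f0[t] <= 0:
            filled[t] = spline(t)
    return filled


# Hypothetical F0 track in Hz: two voiced regions separated by a consonant gap.
track = [0, 200, 210, 220, 0, 0, 0, 180, 170, 160, 0]
smooth = fill_unvoiced(track)
```

The resulting continuous contour is the kind of input from which tone features for a tri-syllable unit could then be extracted. The Fujisaki decomposition and GMM training are separate steps and are not sketched here.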
Source
Technical Acoustics (《声学技术》)
CSCD
2013, No. 4, pp. 305-311 (7 pages)
Funding
National Natural Science Foundation of China (61271360)
Suzhou Applied Basic Research Program (SYG201230)