
Pronunciation Quality Scoring Algorithm Based on Universal Background Model
Abstract This paper introduces the Gaussian Mixture Model (GMM) and the Universal Background Model (UBM), both used successfully in speaker recognition and verification, into pronunciation quality assessment, and proposes a new algorithm for scoring the English pronunciation of Chinese students. The algorithm trains a UBM on standard pronunciation; the UBM describes the phone-independent feature distribution. The duration-normalized log-likelihood ratio of each phone segment is defined as that phone's pronunciation score, and the phone scores are combined into a score for the whole sentence. The algorithm is evaluated on a non-native speech corpus collected in the authors' laboratory. At the sentence level, its correlation with expert scores reaches 0.700, higher than that of the other scoring algorithms compared.
Source Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2008, Issue 22, pp. 207-209 (3 pages)
Keywords Universal Background Model (UBM); log-likelihood ratio; Gaussian Mixture Model (GMM); pronunciation quality scoring
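
The scoring scheme outlined in the abstract can be illustrated with a minimal sketch. This record does not give the exact formulas, so everything specific here is an assumption: each phone is modelled by a diagonal-covariance GMM, the phone score is the per-frame average of the log-likelihood ratio between the phone GMM and the UBM (one reading of "duration-normalized"), and the sentence score is the plain mean of the phone scores. The function names and the `(weights, means, variances)` tuple layout are hypothetical, not taken from the paper.

```python
import numpy as np

def gmm_frame_loglik(frames, weights, means, variances):
    """Per-frame log-likelihood under a diagonal-covariance GMM.

    frames: (T, D); weights: (K,); means and variances: (K, D).
    """
    diff = frames[:, None, :] - means[None, :, :]              # (T, K, D)
    log_comp = -0.5 * (
        np.sum(diff ** 2 / variances, axis=2)                  # Mahalanobis term
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)      # normalizer
    )                                                          # (T, K)
    log_weighted = np.log(weights) + log_comp
    # Log-sum-exp over mixture components, stabilized by the row maximum.
    m = log_weighted.max(axis=1, keepdims=True)
    return (m + np.log(np.exp(log_weighted - m).sum(axis=1, keepdims=True)))[:, 0]

def phone_score(frames, phone_gmm, ubm):
    """Assumed phone score: log-likelihood ratio averaged over the segment's frames."""
    llr = gmm_frame_loglik(frames, *phone_gmm) - gmm_frame_loglik(frames, *ubm)
    return float(np.mean(llr))

def sentence_score(segments, phone_gmms, ubm):
    """Assumed combination rule: unweighted mean of the phone scores.

    segments: list of (phone_label, frames) pairs from a forced alignment.
    """
    scores = [phone_score(f, phone_gmms[label], ubm) for label, f in segments]
    return float(np.mean(scores))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy single-component models over 2-dimensional features (illustrative only).
    ubm = (np.array([1.0]), np.zeros((1, 2)), 4.0 * np.ones((1, 2)))
    phone_gmms = {"AA": (np.array([1.0]), np.zeros((1, 2)), np.ones((1, 2)))}
    segments = [("AA", rng.normal(size=(20, 2)))]
    print(sentence_score(segments, phone_gmms, ubm))
```

Averaging over frames keeps long and short phone segments on the same scale, which is the usual motivation for duration normalization; a weighted combination of phone scores would fit the abstract's description equally well.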










