Maximum F_1-score acoustic model training for automatic mispronunciation detection

Cited by: 3
Abstract: To improve the performance of automatic mispronunciation detection in computer-assisted language learning, a discriminative acoustic model training method is proposed. The method takes as its training criterion the maximization of the F_1-score of mispronunciation detection on a non-native speech database annotated for pronunciation correctness. The objective function is constructed by smoothing the F_1-score with the sigmoid function, and it is optimized with extended Baum-Welch style update equations derived by the weak-sense auxiliary function method. A strategy that updates the acoustic model parameters and the phone thresholds simultaneously is proposed to ensure that the objective function increases monotonically. Mispronunciation detection experiments show that the method effectively increases the F_1-score on both the training and test data, and that precision, recall, and detection accuracy also improve clearly on both sets.
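The sigmoid smoothing mentioned in the abstract can be illustrated with a small sketch. The record does not give the paper's actual formulas, so everything here is an assumption made for illustration: each phone has a scalar pronunciation score, a phone is flagged as mispronounced when its score falls below a threshold, and the names `smoothed_f1`, `hard_f1`, and `slope` are hypothetical. The hard decision is replaced by a sigmoid so that the F_1 value becomes a differentiable function of the scores and the threshold.

```python
import math


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def smoothed_f1(scores, labels, threshold, slope=1.0):
    """Sigmoid-smoothed F_1 (illustrative, not the paper's exact objective).

    scores: per-phone pronunciation scores (assumed: lower = worse).
    labels: 1 if the annotator marked the phone as mispronounced, else 0.
    threshold, slope: hypothetical decision threshold and sigmoid steepness.
    """
    # Soft "detected as mispronounced" indicator in (0, 1).
    soft = [sigmoid(slope * (threshold - s)) for s in scores]
    true_pos = sum(d for d, y in zip(soft, labels) if y == 1)
    detected = sum(soft)   # soft count of flagged phones
    actual = sum(labels)   # number of annotated mispronunciations
    denom = detected + actual
    # F_1 = 2PR / (P + R) = 2 * TP / (detected + actual).
    return 2.0 * true_pos / denom if denom > 0 else 0.0


def hard_f1(scores, labels, threshold):
    # Conventional F_1 with a hard threshold decision, for comparison.
    hard = [1 if s < threshold else 0 for s in scores]
    true_pos = sum(1 for d, y in zip(hard, labels) if d == 1 and y == 1)
    denom = sum(hard) + sum(labels)
    return 2.0 * true_pos / denom if denom > 0 else 0.0


if __name__ == "__main__":
    scores = [-5.0, -4.0, 1.0, 2.0]
    labels = [1, 1, 0, 0]
    print(hard_f1(scores, labels, threshold=-1.0))
    print(smoothed_f1(scores, labels, threshold=-1.0, slope=1.0))
```

As `slope` grows, the smoothed value approaches the hard F_1. A smaller slope gives smoother gradients, which is the property a gradient-style or extended Baum-Welch style optimizer would rely on.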
Source: Acta Acustica (声学学报), indexed in EI, CSCD, and the Peking University Core Journals list; 2013, No. 6, pp. 751-758 (8 pages).
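Funding: Supported by the National Natural Science Foundation of China (60965002, 60865001, 61163026), the Xinjiang Higher Education Scientific Research Program Cultivation Fund (XJEDU2008S15), and the Xinjiang University Doctoral Research Start-up Fund (BS090143).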