Maximum F1-score acoustic model training for automatic mispronunciation detection (cited by: 3)

Abstract: To improve the performance of automatic mispronunciation detection in computer-assisted language learning, a discriminative acoustic model training method is proposed. The method takes maximization of the F1-score of mispronunciation detection on a non-native speech database annotated for pronunciation correctness as the training criterion for the model parameters. The objective function is constructed by smoothing the F1-score with the sigmoid function, and it is optimized with extended Baum-Welch style update equations derived through the weak-sense auxiliary function method. A strategy that updates the acoustic model parameters and the phone thresholds simultaneously is proposed to ensure that the objective function increases monotonically. Mispronunciation detection experiments show that the method effectively increases the F1-score on both the training and test data, and that precision, recall, and detection accuracy also improve markedly on both sets.
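The sigmoid smoothing of the F1-score described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's formulation: the abstract does not give the exact objective, so the decision rule (a phone segment is flagged as mispronounced when its score falls below its phone threshold), the slope parameter `alpha`, and all function names here are assumptions made for illustration only.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def hard_f1(scores, thresholds, labels):
    """Ordinary F1-score of mispronunciation detection.

    scores[i]     -- pronunciation score of segment i (assumed: lower = worse)
    thresholds[i] -- threshold of the phone that segment i belongs to
    labels[i]     -- 1 if the annotator marked segment i as mispronounced
    """
    flags = [1 if s < t else 0 for s, t in zip(scores, thresholds)]
    true_pos = sum(f * y for f, y in zip(flags, labels))
    denom = sum(flags) + sum(labels)
    return 2.0 * true_pos / denom if denom > 0 else 0.0


def smoothed_f1(scores, thresholds, labels, alpha=5.0):
    """Differentiable surrogate: the 0/1 detection decision is replaced
    by sigmoid(alpha * (threshold - score)); alpha is an assumed slope."""
    soft = [sigmoid(alpha * (t - s)) for s, t in zip(scores, thresholds)]
    soft_true_pos = sum(f * y for f, y in zip(soft, labels))
    denom = sum(soft) + sum(labels)
    return 2.0 * soft_true_pos / denom if denom > 0 else 0.0


if __name__ == "__main__":
    scores = [-3.0, -1.0, 0.5, 2.0]
    thresholds = [0.0, 0.0, 0.0, 0.0]
    labels = [1, 0, 1, 0]
    print(hard_f1(scores, thresholds, labels))
    print(smoothed_f1(scores, thresholds, labels, alpha=5.0))
```

Because the smoothed value depends continuously on both the scores (and hence the acoustic model parameters) and the thresholds, it can serve as a training objective where the hard F1-score cannot; as `alpha` grows, the surrogate approaches the hard F1-score.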

Authors: HUANG Hao (黄浩), WANG Jianming (王建明), Halidan Abudureyimu (哈力旦·阿布都热依木), Wushouer Silamu (吾守尔·斯拉木)

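Affiliations: School of Information Science and Engineering, Xinjiang University; School of Electrical Engineering, Xinjiang University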

Source: Acta Acustica (《声学学报》), 2013, No. 6, pp. 751-758 (8 pages). Indexed in EI, CSCD, and the Peking University Core Journals list.

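Funding: National Natural Science Foundation of China (60965002, 60865001, 61163026); Xinjiang Higher Education Scientific Research Program Cultivation Fund (XJEDU2008S15); Xinjiang University Doctoral Research Start-up Fund (BS090143)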

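Keywords: training method; error detection; acoustic model; maximization; pronunciation; sigmoid function; model parameters; objective function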

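Classification: TP391.7 [Automation and Computer Technology: Computer Application Technology]; TN912.3 [Electronics and Telecommunications: Communication and Information Systems]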


