Computer-aided pronunciation training(CAPT) technologies enable the use of automatic speech recognition to detect mispronunciations in second language(L2) learners' speech. In order to further facilitate learning...Computer-aided pronunciation training(CAPT) technologies enable the use of automatic speech recognition to detect mispronunciations in second language(L2) learners' speech. In order to further facilitate learning, we aim to develop a principle-based method for generating a gradation of the severity of mispronunciations. This paper presents an approach towards gradation that is motivated by auditory perception. We have developed a computational method for generating a perceptual distance(PD) between two spoken phonemes. This is used to compute the auditory confusion of native language(L1). PD is found to correlate well with the mispronunciations detected in CAPT system for Chinese learners of English,i.e., L1 being Chinese(Mandarin and Cantonese) and L2 being US English. The results show that auditory confusion is indicative of pronunciation confusions in L2 learning. PD can also be used to help us grade the severity of errors(i.e.,mispronunciations that confuse more distant phonemes are more severe) and accordingly prioritize the order of corrective feedback generated for the learners.展开更多
基金supported by the National Basic Research 973 Program of China under Grant No.2013CB329304the National Natural Science Foundation of China under Grant No.61370023+2 种基金the Major Project of the National Social Science Foundation of China under Grant No.13&ZD189partially supported by the General Research Fund of the Hong Kong SAR Government under Project No.415511the CUHK Teaching Development Grant
文摘Computer-aided pronunciation training(CAPT) technologies enable the use of automatic speech recognition to detect mispronunciations in second language(L2) learners' speech. In order to further facilitate learning, we aim to develop a principle-based method for generating a gradation of the severity of mispronunciations. This paper presents an approach towards gradation that is motivated by auditory perception. We have developed a computational method for generating a perceptual distance(PD) between two spoken phonemes. This is used to compute the auditory confusion of native language(L1). PD is found to correlate well with the mispronunciations detected in CAPT system for Chinese learners of English,i.e., L1 being Chinese(Mandarin and Cantonese) and L2 being US English. The results show that auditory confusion is indicative of pronunciation confusions in L2 learning. PD can also be used to help us grade the severity of errors(i.e.,mispronunciations that confuse more distant phonemes are more severe) and accordingly prioritize the order of corrective feedback generated for the learners.