Funding: supported by the National Basic Research 973 Program of China under Grant No. 2013CB329304, the National Natural Science Foundation of China under Grant No. 61370023, and the Major Project of the National Social Science Foundation of China under Grant No. 13&ZD189; partially supported by the General Research Fund of the Hong Kong SAR Government under Project No. 415511 and the CUHK Teaching Development Grant.
Abstract: Computer-aided pronunciation training (CAPT) technologies use automatic speech recognition to detect mispronunciations in second-language (L2) learners' speech. To further facilitate learning, we aim to develop a principled method for grading the severity of mispronunciations. This paper presents an approach to such gradation that is motivated by auditory perception. We have developed a computational method for generating a perceptual distance (PD) between two spoken phonemes, which is used to compute auditory confusion in the native language (L1). PD is found to correlate well with the mispronunciations detected by a CAPT system for Chinese learners of English, i.e., L1 being Chinese (Mandarin and Cantonese) and L2 being US English. The results show that auditory confusion is indicative of pronunciation confusions in L2 learning. PD can also be used to grade the severity of errors (i.e., mispronunciations that confuse more distant phonemes are more severe) and accordingly to prioritize the order of corrective feedback generated for learners.
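The abstract does not specify how the perceptual distance is computed. A common proxy for acoustic distance between two phoneme realizations is dynamic time warping (DTW) over per-frame feature vectors (e.g., MFCC frames); the sketch below illustrates that idea with toy numbers, and all names and values are hypothetical, not the authors' method.

```python
# Hypothetical sketch: a perceptual-distance proxy between two phoneme
# realizations, computed as DTW alignment cost over per-frame acoustic
# feature vectors (e.g., MFCC frames). Feature values here are toy numbers.
import math

def frame_dist(a, b):
    # Euclidean distance between two feature frames.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw_distance(seq_a, seq_b):
    # Classic dynamic-time-warping cost between two variable-length
    # sequences of feature frames, normalized by total length.
    n, m = len(seq_a), len(seq_b)
    INF = float("inf")
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # insertion
                                 cost[i][j - 1],      # deletion
                                 cost[i - 1][j - 1])  # match
    return cost[n][m] / (n + m)

# Toy 2-D "feature frames" for two phoneme realizations.
phone_iy = [(1.0, 0.2), (1.1, 0.3), (1.0, 0.25)]   # e.g., /iy/
phone_ih = [(0.8, 0.4), (0.9, 0.5)]                # e.g., /ih/
print(round(dtw_distance(phone_iy, phone_ih), 3))  # prints 0.167
```

Under this proxy, a mispronunciation that substitutes a phoneme with a larger DTW distance from the target would be graded as more severe.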
Abstract: How to cultivate innovative talent has become an important educational issue. In China's long-standing mentorship-based education environment, the supervisor-student relationship often affects students' creativity. From the perspective of student psychology, we explore the mechanism by which the supervisor-student relationship influences creativity, using machine learning and a questionnaire survey. In Study 1, based on video interviews with 16 postgraduate students, we use machine learning to analyze the emotional states the students exhibit in the videos when recalling supervisor-student interaction scenarios, finding that students show negative emotions in poor supervisor-student relationships. We then further explore the impact of the supervisor-student relationship on postgraduate students' development in these interaction scenarios at the affective level. In Study 2, a questionnaire survey is conducted to examine the relationships among the relevant variables, finding that a good supervisor-student relationship can significantly reduce power stereotype threat, decrease the surface-acting behaviors of emotional labor, and promote the expression of creativity. These results theoretically reveal the internal psychological processes by which the supervisor-student relationship affects creativity, and have important implications for reducing emotional labor and enhancing the creative expression of postgraduate students.
Funding: supported by the Beijing Key Laboratory of Behavior and Mental Health, Peking University.
Abstract: The fusion technique is key to the multimodal emotion recognition task. Recently, cross-modal attention-based fusion methods have demonstrated high performance and strong robustness. However, cross-modal attention suffers from redundant features and does not capture complementary features well. We find that it is not necessary to use the entire information of one modality to reinforce the other during cross-modal interaction, and that the features capable of reinforcing a modality may be only a part of it. To this end, we design an innovative Transformer-based Adaptive Cross-modal Fusion Network (TACFN). Specifically, to handle redundant features, we have one modality perform intra-modal feature selection through a self-attention mechanism, so that the selected features can adaptively and efficiently interact with the other modality. To better capture the complementary information between the modalities, we obtain a fusion weight vector by splicing (concatenation) and use this weight vector to achieve feature reinforcement across the modalities. We apply TACFN to the RAVDESS and IEMOCAP datasets. For a fair comparison, we use the same unimodal representations to validate the effectiveness of the proposed fusion method. The experimental results show that TACFN brings a significant performance improvement over other methods and achieves state-of-the-art performance. All code and models can be accessed at https://github.com/shuzihuaiyu/TACFN.
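The two ideas the abstract describes can be sketched generically: intra-modal self-attention as soft feature selection, followed by a fusion weight vector derived from the spliced (concatenated) modality summaries. This is an illustrative sketch with toy dimensions and random projections, not the paper's actual implementation; the authors' code is at the repository linked above.

```python
# Illustrative sketch (not the paper's code): self-attention feature
# selection within each modality, then a gating weight vector obtained
# by concatenating ("splicing") both modalities' pooled summaries.
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_select(x, d_k=8):
    # Intra-modal self-attention: each frame re-weights the others,
    # acting as a soft feature-selection step before fusion.
    d = x.shape[-1]
    Wq, Wk, Wv = (rng.standard_normal((d, d_k)) / np.sqrt(d) for _ in range(3))
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    attn = softmax(q @ k.T / np.sqrt(d_k))
    return attn @ v                                   # (T, d_k)

def fuse(audio, video):
    a_sel = self_attention_select(audio).mean(axis=0)  # pooled summaries
    v_sel = self_attention_select(video).mean(axis=0)
    spliced = np.concatenate([a_sel, v_sel])           # "splicing"
    W = rng.standard_normal((spliced.size, a_sel.size))
    gate = 1.0 / (1.0 + np.exp(-(spliced @ W)))        # fusion weight vector
    # Reinforce features: elementwise blend driven by both modalities.
    return gate * a_sel + (1.0 - gate) * v_sel

audio = rng.standard_normal((20, 16))   # 20 frames of 16-dim audio features
video = rng.standard_normal((12, 16))   # 12 frames of 16-dim visual features
fused = fuse(audio, video)
print(fused.shape)                      # (8,)
```

The gate is computed from both modalities jointly, so each fused dimension can lean on whichever modality is more informative, which is the intuition behind using only part of one modality to reinforce the other.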