Funding: Project (60763001) supported by the National Natural Science Foundation of China; Projects (2009GZS0027, 2010GZS0072) supported by the Natural Science Foundation of Jiangxi Province, China.
Abstract: To overcome defects of the classical hidden Markov model (HMM), a new statistical model, the Markov family model (MFM), was proposed. The Markov family model was applied to speech recognition and natural language processing. Speaker-independent continuous speech recognition experiments and part-of-speech tagging experiments show that the Markov family model outperforms the hidden Markov model. In the part-of-speech tagging experiments, precision rises from 94.642% to 96.214%; in the speech recognition experiments, the word error rate is reduced by 11.9% relative to the HMM baseline system.
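The tagging figures above can also be read as a relative reduction in errors. The sketch below is only arithmetic on the two reported percentages; it assumes the error rate is 100 minus the reported precision, and the function name `relative_error_reduction` is hypothetical rather than taken from the paper.

```python
def relative_error_reduction(baseline_score, new_score):
    """Relative drop in error (in percent), assuming error = 100 - score."""
    baseline_error = 100.0 - baseline_score
    new_error = 100.0 - new_score
    return (baseline_error - new_error) / baseline_error * 100.0

# Figures reported in the abstract for part-of-speech tagging.
reduction = relative_error_reduction(94.642, 96.214)
print(f"relative error reduction: {reduction:.2f}%")
```

Under that assumption, the 1.572-point gain corresponds to removing roughly 29% of the baseline tagging errors.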
Funding: Project (61072087) supported by the National Natural Science Foundation of China; Project (2010011020-1) supported by the Natural Scientific Foundation of Shanxi Province, China; Project (20093010) supported by the Graduate Innovation Foundation of Shanxi Province, China.
Abstract: Based on an auditory model, the zero-crossings with maximal Teager energy operator (ZCMT) feature extraction approach is described and then applied to speech and emotion recognition. Three kinds of experiments were carried out. The first consists of isolated word recognition experiments on neutral (non-emotional) speech. The results show that the ZCMT approach improves recognition accuracy by 3.47% on average compared with the Teager energy operator (TEO). Thus, the ZCMT feature can be considered a noise-robust feature for speech recognition. The second consists of monolingual emotion recognition experiments using the Taiyuan University of Technology (TYUT) and Berlin databases. The average recognition rate of the ZCMT approach is 82.19%, which indicates that ZCMT features characterize speech emotions effectively. The third consists of cross-lingual experiments with three languages. The accuracy of the ZCMT approach drops by only 1.45%, which indicates that ZCMT features characterize emotions in a language-independent way.
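For orientation, the discrete Teager energy operator is commonly defined as ψ[x(n)] = x(n)² − x(n−1)·x(n+1). The sketch below assumes that standard definition and pairs it with a simple upward zero-crossing detector. It does not reproduce the ZCMT pipeline, whose auditory-model details the abstract does not give; the function names and the test tone are hypothetical.

```python
import math

def teager_energy(x):
    """Discrete Teager energy psi[n] = x[n]^2 - x[n-1]*x[n+1] for interior samples."""
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

def upward_zero_crossings(x):
    """Indices n where the signal moves from negative (n-1) to non-negative (n)."""
    return [n for n in range(1, len(x)) if x[n - 1] < 0 <= x[n]]

# Hypothetical test signal: a pure tone A*cos(omega*n).
amplitude, omega = 2.0, 0.3
signal = [amplitude * math.cos(omega * n) for n in range(200)]

energy = teager_energy(signal)
crossings = upward_zero_crossings(signal)

# For a pure tone the operator is constant: A^2 * sin(omega)^2.
expected = (amplitude * math.sin(omega)) ** 2
print(max(abs(e - expected) for e in energy))
print(crossings[:3])
```

For a pure tone the operator returns the constant A²·sin²(ω), so it tracks amplitude and frequency jointly, which is one reason it is used as an energy measure in speech features.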
Abstract: The input of the network is the key problem in Chinese word sense disambiguation with neural networks. This paper presents an input model for the neural network that uses statistical methods to calculate the mutual information between contextual words and the ambiguous word, taking a fixed number of contextual words on either side of the ambiguous word according to a window (-M, +N). The experiment adopts a three-layer BP neural network model and shows how the size of the training set and the values of M and N affect the performance of the neural network model. The experimental objects are six pseudowords, each with three word senses, constructed according to certain principles. The tested accuracy of our approach reaches 90.31% on a closed corpus and 89.62% on an open corpus. The experiment shows that the neural network model performs well on word sense disambiguation.
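One way to picture the (-M, +N) window and the mutual-information input is the sketch below. It assumes pointwise mutual information between a context word and a sense label, estimated from raw counts over a small labelled corpus. The abstract does not state the exact estimator, so the formula, the function names, and the English toy sentences are all illustrative assumptions.

```python
import math
from collections import Counter

def context_window(tokens, index, m, n):
    """Up to m tokens to the left and n tokens to the right of tokens[index]."""
    left = tokens[max(0, index - m):index]
    right = tokens[index + 1:index + 1 + n]
    return left + right

def pmi_table(examples, m, n):
    """examples: (tokens, target_index, sense) triples.
    Returns {(word, sense): PMI in bits}, estimated from window co-occurrence counts."""
    pair_counts, word_counts, sense_counts = Counter(), Counter(), Counter()
    total = 0
    for tokens, index, sense in examples:
        for word in context_window(tokens, index, m, n):
            pair_counts[(word, sense)] += 1
            word_counts[word] += 1
            sense_counts[sense] += 1
            total += 1
    return {
        (word, sense): math.log2(count * total / (word_counts[word] * sense_counts[sense]))
        for (word, sense), count in pair_counts.items()
    }

# Hypothetical toy corpus; the ambiguous word is "bank".
examples = [
    ("deposit money in the bank today".split(), 4, "finance"),
    ("the river bank was muddy".split(), 2, "river"),
    ("open a bank account for money".split(), 2, "finance"),
]
table = pmi_table(examples, m=2, n=2)
print(context_window(examples[0][0], 4, 2, 2))
print(table[("river", "river")], table[("the", "finance")])
```

In a setup like the one described, the PMI scores of the M + N window words would form the input vector to the network; how the paper actually arranges them is not specified in the abstract.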
Abstract: Predicate-argument (PA) structure analysis is often divided into three subtasks: predicate sense disambiguation, argument identification, and argument classification. To date, they have mostly been modeled in isolation. However, this approach neglects logical constraints between them. We therefore explore integrating predicate sense disambiguation with each of the latter two subtasks, which verifies that automatic predicate sense disambiguation can help the semantic role labeling task. In addition, a dual decomposition algorithm is used to alleviate error propagation between the argument identification and argument classification subtasks, which greatly benefits argument identification. Experimental results show that our approach achieves better PA analysis performance than other pipeline approaches.
Abstract: L2 listening comprehension is a cognitive process in which listeners use both bottom-up and top-down processing to comprehend the aural text. This study explores whether Chinese EFL learners' listening problems are more closely associated with their bottom-up or their top-down processing. A questionnaire was administered and students' verbal reports were collected. The results show that both high- and low-proficiency listeners are actively engaged in top-down processing, but their degree of comprehension depends, to a large extent, on their success in bottom-up processing. This study calls for shifting more focus back to training Chinese EFL learners' bottom-up processing ability.
Abstract: This study investigates the word recognition processes and strategies of intermediate learners of Chinese as a Second Language (CSL) in contextual reading settings. Two intermediate CSL learners were chosen as research participants, and think-aloud methods and retrospective interviews were used to collect data. The data were analyzed using Moustakas' data analysis procedure, Creswell's three steps, and Bogdan and Biklen's data analysis methods. Results indicated that intermediate CSL learners go through different word recognition processes: recognition may be automatic, or based on context, pronunciation, previous knowledge, or the meaning of characters; when recognition fails, learners either skip the words or skip them and read them again later. Their word recognition strategies in contextual reading settings mainly comprise cognitive strategies and self-regulatory strategies. Cognitive strategies consist of direct transformation, translation, interpretation, guessing, inferring, and finding key words; self-regulatory strategies include metacognitive, behavior-regulating, emotion-regulating, and motivation-regulating strategies. A model of intermediate CSL learners' word recognition strategies can be constructed from the results. The study has both theoretical and pedagogical implications for CSL vocabulary acquisition and teaching.
Funding: Project supported by the Fundamental Research Funds for the Central Universities, China (No. lzujbky-2013-41), the National Natural Science Foundation of China (No. 61201446), the Basic Scientific Research Business Expenses of the Central University, and the Open Project of Key Laboratory for Magnetism and Magnetic Materials of the Ministry of Education, Lanzhou University (No. LZUMMM2015010).
Abstract: We propose a heterogeneous, mid-level feature-based method for recognizing natural scene categories. The proposed feature introduces spatial information among the latent topics by means of a spatial pyramid, while the latent topics are obtained using probabilistic latent semantic analysis (pLSA) based on the bag-of-words representation. The proposed feature always performs better than standard pLSA, because the performance of pLSA is adversely affected in many cases by the loss of spatial information. By combining various interest point detectors and local region descriptors used in the bag-of-words model, the proposed feature can yield further improvements on diverse scene category recognition tasks. We also propose a two-stage framework for multi-class classification. In the first stage, for each possible detector/descriptor pair, adaptive boosting classifiers are employed to select the most discriminative topics and then compute posterior probabilities of an unknown image from the selected topics. The second stage uses the prod-max rule to combine information from multiple sources and assigns the unknown image to the scene category with the highest 'final' posterior probability. Experimental results on three benchmark scene datasets show that the proposed method exceeds most state-of-the-art methods.
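The idea of adding spatial information to latent topics through a spatial pyramid can be sketched as below. The sketch assumes each local patch has already been assigned a topic index and a normalized image position, and it concatenates unweighted per-cell topic counts over successively finer grids. The abstract does not describe the paper's weighting or the way pLSA posteriors enter the feature, so the function name and the hard topic assignment are illustrative assumptions.

```python
def spatial_pyramid_histogram(points, num_topics, levels):
    """points: (x, y, topic) triples with x, y in [0, 1).
    Concatenates per-cell topic counts for grids of 1x1, 2x2, 4x4, ... cells."""
    feature = []
    for level in range(levels):
        cells = 2 ** level  # cells per side at this pyramid level
        grid = [[0] * num_topics for _ in range(cells * cells)]
        for x, y, topic in points:
            col = min(int(x * cells), cells - 1)
            row = min(int(y * cells), cells - 1)
            grid[row * cells + col][topic] += 1
        for cell in grid:
            feature.extend(cell)
    return feature

# Hypothetical patches: two topics placed in opposite image corners.
patches = [(0.1, 0.1, 0), (0.9, 0.1, 1), (0.1, 0.9, 1), (0.9, 0.9, 0)]
feature = spatial_pyramid_histogram(patches, num_topics=2, levels=2)
print(feature)
```

The level-0 part of the vector is the plain topic histogram, which discards all layout; the finer levels record where in the image each topic occurs, which is the spatial information the abstract says standard pLSA loses.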