Abstract: Language training for nonfluent aphasia (NFA) patients may increase their verbal expression of unfamiliar words. Some therapies aimed at improving cognitive functions can facilitate the recovery of NFA patients' damaged linguistic functions. Some studies have shown that, given music cues, NFA patients could fluently sing familiar songs but could not read the lyrics, consistent with studies of proverbs and prayer.1 Our previous research has shown that highly related voice cues can improve NFA patients' verbal expression.2 These results indicate that improving NFA patients' speech production may depend on regaining access to phonological encodings already preserved in memory, rather than on relearning the language.
Abstract: Realization of an intelligent human-machine interface requires us to investigate human mechanisms and learn from them. This study focuses on the communication between speech production and perception within the human brain, and on realizing it in an artificial system. A physiological study based on electromyographic signals (Honda, 1996) suggested that speech communication in the human brain may rest on a topological mapping between speech production and perception, reflecting an analogous topology between motor and sensory representations. Following this hypothesis, this study first investigated the topologies of the vowel system across the motor, kinematic, and acoustic spaces by means of a model simulation, and then examined the linkage between vowel production and perception through a transformed auditory feedback (TAF) experiment. The model simulation indicated that there is an invariant mapping from muscle activations (motor space) to articulations (kinematic space) via a coordinate system of force-dependent equilibrium positions, and that this mapping from the motor space to the kinematic space is unique. The motor-kinematic-acoustic deduction in the model simulation showed that the topologies are compatible from one space to another. In the TAF experiment, vowel production exhibited a compensatory response to a perturbation of the feedback sound, implying that vowel production is controlled with reference to perceptual monitoring.
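The force-dependent equilibrium-position idea above can be sketched with a toy two-muscle model. This is an illustrative simplification, not the paper's simulation: the stiffness values, rest positions, and the linear articulation-to-formant map are all invented, and serve only to show how an ordering (topology) in the motor space can survive into the kinematic and acoustic spaces.

```python
import numpy as np

# Hypothetical constants for an agonist/antagonist muscle pair.
K_AGONIST, K_ANTAGONIST = 1.0, 1.0    # muscle stiffness (assumed)
X_AGONIST, X_ANTAGONIST = 1.0, -1.0   # muscle rest positions (assumed)

def motor_to_kinematic(a_ag, a_an):
    """Equilibrium position of a two-muscle joint: the rest positions
    weighted by activation * stiffness. For any nonzero total
    activation the mapping is unique."""
    f_ag, f_an = a_ag * K_AGONIST, a_an * K_ANTAGONIST
    return (f_ag * X_AGONIST + f_an * X_ANTAGONIST) / (f_ag + f_an)

def kinematic_to_acoustic(x):
    """Monotone (topology-preserving) stand-in for articulation->formant."""
    return 500.0 + 300.0 * x          # Hz; illustrative constants only

# Sweeping agonist activation: equilibrium positions and the
# 'formants' derived from them stay in the same order, i.e. the
# topology of the motor space is carried into the acoustic space.
activations = np.linspace(0.1, 0.9, 5)
positions = [motor_to_kinematic(a, 1.0 - a) for a in activations]
formants = [kinematic_to_acoustic(x) for x in positions]
print(positions)
print(formants)
```

Balanced activation (0.5, 0.5) lands at the midpoint between the two rest positions, which is why the mapping is invertible along this sweep.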
Abstract: The experiment presented in this research targets the 'positional' stage of the 'modular' model of speech production originally proposed by Levelt (1989) and Bock & Levelt (1994), in which selected lemmas are inserted into syntactic frames. Results suggest a difference between L1 and L2 English speakers at the positional stage. While this might suggest that the speech planning process differs between native and non-native speakers, an alternative view is also proposed: the observed differences may result from differences in how linguistic forms are stored, rather than from a fundamental difference in how speech is planned. The results indicate that the main verb, copula be, and local dependency effects are the three elements that affect the realization of English subject-verb agreement, and they help locate the stage at which L2 subject-verb agreement errors arise.
Funding: Supported by the National Natural Science Foundation of China (69972046) and the Natural Science Foundation of Zhejiang Province (698076).
Abstract: A method to synthesize formant-targeted sounds based on a speech production model and a Reflection-Type Line Analog (RTLA) articulatory synthesis model is presented. The synthesis model is implemented with a scattering process derived from an RTLA of the vocal tract system according to the acoustic mechanism of speech production. The vocal-tract area function that controls the synthesis model is derived from the first three formant trajectories by using the inverse solution of speech production. The proposed method not only gives good naturalness and dynamic smoothness, but also can control or modify speech timbre easily and flexibly. Furthermore, it needs fewer control parameters and a very low parameter update rate.
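The scattering process of a reflection-type line analog can be sketched in the classic Kelly-Lochbaum style: the tract is a chain of uniform tube sections, and each junction reflects and transmits the travelling pressure waves according to the ratio of adjacent cross-sectional areas. The area values below are invented for illustration, not a real vocal-tract area function, and this is a generic sketch rather than the paper's implementation.

```python
import numpy as np

# Hypothetical cross-sectional areas of five tube sections (cm^2).
areas = np.array([2.0, 3.5, 5.0, 3.0, 1.5])

def reflection_coefficients(areas):
    """r_k = (A_k - A_{k+1}) / (A_k + A_{k+1}) at each junction."""
    a0, a1 = areas[:-1], areas[1:]
    return (a0 - a1) / (a0 + a1)

def scatter(f_in, b_in, r):
    """One scattering junction: the forward wave f and backward wave b
    are each partially transmitted and partially reflected."""
    f_out = (1.0 + r) * f_in - r * b_in
    b_out = r * f_in + (1.0 - r) * b_in
    return f_out, b_out

r = reflection_coefficients(areas)
print(r)                        # |r| < 1 for any positive areas
print(scatter(1.0, 0.0, r[0]))  # a unit forward pulse at junction 0
```

With r = 0 (equal areas) the junction is transparent, which is the sanity check that the scattering equations reduce to pure propagation.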
Abstract: This research studies the features of chest and abdominal breathing in the Zhuang language. Two participants were recruited to record 30 news articles in Zhuang. The chest and abdominal breathing signals, as well as the speech signal, were recorded simultaneously. Programs for breathing analysis were written to extract parameters such as breathing reset amplitude, inhale-phase duration, and exhale-phase slope. The results show that the inhale and exhale resets of abdominal breathing occur earlier than those of chest breathing, and that breathing resets are related to prosodic boundaries.
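The three breathing parameters named above can be sketched on a synthetic breathing cycle. The signal shape, sampling rate, and the exact parameter definitions below are assumptions for illustration; the study's recordings and extraction programs are not reproduced here.

```python
import numpy as np

fs = 100                               # Hz, assumed sampling rate
t = np.arange(0, 4, 1 / fs)
# One synthetic breathing cycle: a 1 s inhale rising to 1.0,
# then a 3 s exhale falling linearly to 0.25.
signal = np.where(t < 1.0, t, 1.0 - 0.25 * (t - 1.0))

peak = int(np.argmax(signal))
inhale_time = t[peak] - t[0]              # duration of the inhale phase
reset_amplitude = signal[peak] - signal[0]  # breathing reset amplitude
# Slope of the exhale phase via a least-squares line fit.
exhale_slope = np.polyfit(t[peak:], signal[peak:], 1)[0]
print(inhale_time, reset_amplitude, exhale_slope)
```

On real data the peak-picking step would need smoothing and cycle segmentation first; the point here is only which quantity each parameter measures.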
Funding: The authors would like to acknowledge the Ministry of Electronics and Information Technology (MeitY), Government of India, for financial support through the scholarship for Palli Padmini during this research work under the Visvesvaraya Ph.D. Scheme for Electronics and IT.
Abstract: The present system experimentally demonstrates the synthesis of syllables and words from tongue maneuvers in multiple languages, captured by only four oral sensors. For an experimental demonstration of the system in the oral cavity, a prototype tooth model was used. Based on the principle developed in a previous publication by the author(s), the proposed system has been implemented using oral cavity (tongue, teeth, and lips) features alone, without the glottis and the larynx. The positions of the sensors were optimized based on articulatory (oral cavity) gestures estimated by simulating the mechanism of human speech. The system has been tested on all letters of the English alphabet and several words with sensor-based input, along with an experimental demonstration of the developed algorithm, with limit switches, a potentiometer, and flex sensors emulating the tongue in an artificial oral cavity. The system produces the sounds of vowels, consonants, and words in English, along with the pronunciation of the meanings of their translations in four major Indian languages, all from oral cavity mapping. The experimental setup also caters to gender mapping of the voice. The sound produced from the hardware was validated by a perceptual test in which listeners verified the gender and the word of the speech sample, with ~98% and ~95% accuracy, respectively. Such a model may be useful for interpreting speech for those who are speech-disabled because of accidents, neurological disorders, spinal cord injury, or larynx disorders.
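The core mapping idea, from a small set of discrete oral-sensor states to phoneme labels, can be sketched as a lookup table. Everything below is invented for illustration: the sensor coding, the table entries, and the phoneme choices are hypothetical and do not reproduce the paper's optimized sensor placement or algorithm.

```python
# Hypothetical coding of four oral sensors: (tip, back, jaw, lip).
SENSOR_TO_PHONE = {
    (1, 0, 0, 0): "t",   # tongue-tip closure (invented entry)
    (0, 1, 0, 0): "k",   # tongue-back closure (invented entry)
    (0, 0, 1, 1): "a",   # open jaw with spread lips (invented entry)
}

def decode(sensor_state):
    """Map one four-sensor reading to a phoneme label ('?' if unknown)."""
    return SENSOR_TO_PHONE.get(tuple(sensor_state), "?")

# A sequence of sensor frames becomes a phone string for synthesis.
word = "".join(decode(s) for s in [(1, 0, 0, 0), (0, 0, 1, 1)])
print(word)
```

A real system would replace the table with the gesture-estimation step described in the abstract and feed the phone string to a speech synthesizer.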
Funding: Supported in part by a grant for Promoting Science and Technology from the Japan Ministry of Education, Culture, Sports, Science and Technology, and by the SCOPE program of the Ministry of Internal Affairs and Communications (MIC), Japan (No. 071705001).
Abstract: A three-dimensional (3-D) physiological articulatory model was developed to account for the biomechanical properties of the speech organs in speech production. Using the model to investigate the mechanism of speech production requires an efficient control module that estimates the muscle activation patterns used to drive the 3-D physiological articulatory model toward a desired articulatory posture. For this purpose, a feedforward control strategy was developed that maps an articulatory target to the corresponding muscle activation pattern via an intrinsic representation of vowel articulation. In this process, the articulatory postures are first mapped to their intrinsic representations; the postures are then clustered in the intrinsic representation space, and for each cluster a nonlinear function is approximated with a general regression neural network (GRNN) to map the intrinsic representation of vowel articulation to the muscle activation pattern. The results show that the feedforward control module can drive the 3-D physiological articulatory model for vowel production with high accuracy, both acoustically and articulatorily.
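A GRNN is essentially a Gaussian-kernel (Nadaraya-Watson) regressor: each prediction is a distance-weighted average of the training targets. The sketch below shows that mechanism on random stand-in data; the dimensions, bandwidth, and the synthetic "muscle activations" are assumptions, not the paper's articulatory data.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in training set: 50 intrinsic representations (3-D) paired
# with 3-channel 'muscle activation' targets in [0, 1].
X_train = rng.uniform(-1, 1, size=(50, 3))
W_train = np.sin(X_train) * 0.5 + 0.5

def grnn_predict(x, X, W, sigma=0.3):
    """GRNN output: Gaussian-kernel weighted average of the training
    targets, with bandwidth sigma (assumed here)."""
    d2 = np.sum((X - x) ** 2, axis=1)       # squared distances to x
    k = np.exp(-d2 / (2 * sigma ** 2))      # kernel weights
    return k @ W / np.sum(k)                # weighted mean of targets

x = np.array([0.2, -0.1, 0.4])              # a query representation
pred = grnn_predict(x, X_train, W_train)
print(pred)
```

Because the output is a convex combination of training targets, a GRNN never extrapolates outside the range of observed activations, which is a convenient safety property for driving a physical or physiological model.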
Funding: Supported by the CASS Innovation Program for Young Scholars.
Abstract: This study explored how native speakers use intonation to produce French clause-combining complexes with causal conjunctions, particularly how the prosodic realization is affected by the narrative order of the cause and effect events, which either conforms to or conflicts with the iconic reasoning order, in a conversation with projected focus. Ten native French speakers were recruited to read aloud 68 question-answer pairs. The critical answer conveys volitional content causality and consists of a prior clause combined with a causal or consequence clause introduced by the conjunction car or donc, forming effect-cause (EC) or cause-effect (CE) order, respectively. It responds to either a why-question or a general question, so that the focus position is manipulated. Results for clausal boundary intonation and prosodic prominence placement were convergent: EC order and focus on the second clause increased the use of continuing boundary intonation and of prominence on the second clause, compared with CE order and focus on the prior clause, as both factors showed main effects. Our finding does not support the cognitive account predicting prosodic dissociation for non-iconic order; instead, it may shed light on the critical role of prosody in marking causality by highlighting the influence of contextualization cues.
Funding: This paper is part of the research project "A TAPs-Based Cognitive Approach to Interpreting Studies" (2016SJB740029), funded by the Jiangsu Provincial Department of Education. The author would like to acknowledge with gratitude the anonymous reviewers' comments on this paper.
Abstract: Spoken language is marked by para-verbal and non-verbal dimensions, such as silent pauses, hesitation, and intonation. Fluency is considered an essential parameter for evaluating the quality of oral output. Sight translation is a hybrid form of written translation and oral interpreting; however, fluency in sight translation is an under-researched topic. This paper examines disfluencies in the sight translation of professional and novice translators working from English to Chinese. It adopted a statistical approach to compare the silent pauses and hesitation fillers in the delivery of professional and student participants. According to the independent-samples t-test results, the differences in the occurrence and ratio of hesitation between professional and student participants were significant. It can be inferred that professional translators are more adept at coping tactics, such as pausing for several seconds, rather than inserting hesitation fillers unconsciously. Furthermore, a corpus-assisted analysis suggested a higher lexical density among professional participants, and a think-aloud approach revealed the causes of the disfluencies. Drawing upon Speech Production Theory, the paper found the six most influential factors to be vocabulary, emotion, syntactic category, speaking habit, lexical ambiguity, and topical difficulty. It is hoped that translation and interpreting studies will not be confined to linguistic dimensions; paralinguistic signs should have their place in the domain of translatology. Semiotics of translation, which views translation as a purely semiotic act, thereby provides a valuable perspective for translation and interpreting studies.
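The independent-samples comparison described above can be sketched with Welch's t statistic (the unequal-variance form of the test). The hesitation-filler counts below are invented stand-ins, not the study's data; the sketch only shows the shape of the computation.

```python
import math
from statistics import mean, variance

# Hypothetical hesitation fillers per 100 words for two groups.
professionals = [2.1, 1.8, 2.5, 1.9, 2.2, 1.7]
students = [4.3, 3.9, 5.1, 4.6, 4.0, 4.8]

def welch_t(a, b):
    """Welch's t statistic and degrees of freedom (Welch-Satterthwaite)
    for two independent samples with unequal variances."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

t_stat, df = welch_t(professionals, students)
print(t_stat, df)
```

A strongly negative t here would mirror the reported finding that professionals insert fewer hesitation fillers; in practice one would obtain the p-value from the t distribution with df degrees of freedom (e.g. via scipy.stats).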