This paper discusses the differences in speech communication between China and Britain. By comparing the status, conditions, and actions of speech in relation to culture, we see that culture differs between China and Britain. If Chinese learners want to learn English well, they must know something about its culture; otherwise, we cannot say that they have learned English well. As to how culture is learned and taught, English teachers need to conduct further research on culture in the coming days.
The current study applied classic communication models to investigate 34 award-winning impromptu speeches at the 2010 and 2011 "FLTRP (Foreign Language Teaching and Research Press) Cup" English Speaking Contest, one of the most influential of its kind in China, to identify the features of the public speaking skills of advanced Chinese EFL learners. The speech scripts and video excerpts from the subsequent manuscript collection, published with a CD-ROM by FLTRP, were studied. Lucas' framework of the speech communication process was borrowed to bridge elements of the communication models and the speech delivery process. Three key aspects of the speakers' encoding endeavors were closely examined: verbal preferences, non-verbal preferences, and topic selection for exemplification. It was found that successful speakers demonstrated a clear and strong audience orientation. They chose more first-person pronouns, fewer abstract words, a controlled number of dependent clauses, clear transition words, and limited figures of speech. They produced very few verbal fillers and slips of the tongue, and used a moderate speech rate and varied tone inflections. The speakers also showed distinctive features in gestures, eye contact, and facial expressions. They employed anecdotes that shared a common field of experience with the audience. Pedagogical implications for the teaching of public speaking were discussed.
Realizing an intelligent human-machine interface requires us to investigate human mechanisms and learn from them. This study focuses on the communication between speech production and perception within the human brain and on realizing it in an artificial system. A physiological study based on electromyographic signals (Honda, 1996) suggested that speech communication in the human brain might be based on a topological mapping between speech production and perception, according to an analogous topology between motor and sensory representations. Following this hypothesis, this study first investigated the topologies of the vowel system across the motor, kinematic, and acoustic spaces by means of a model simulation, and then examined the linkage between vowel production and perception by means of a transformed auditory feedback (TAF) experiment. The model simulation indicated that there exists an invariant mapping from muscle activations (motor space) to articulations (kinematic space) via a coordinate consisting of force-dependent equilibrium positions, and that the mapping from the motor space to the kinematic space is unique. The motor-kinematic-acoustic deduction in the model simulation showed that the topologies were compatible from one space to another. In the TAF experiment, vowel production exhibited a compensatory response to a perturbation in the feedback sound. This implies that vowel production is controlled with reference to perceptual monitoring.
In this work, the primary focus is to identify potential technical risks of Artificial Intelligence (AI)-driven operations for the safety monitoring of air traffic from the perspective of speech communication, by studying a representative case and evaluating user experience. The case study is performed to evaluate the AI-driven techniques and applications using objective metrics, from which several risks and technical facts are obtained to direct future research. Considering the safety-critical specificities of the air traffic control system, a comprehensive subjective evaluation is conducted to collect user experience through a well-designed anonymous questionnaire and a face-to-face interview. In this procedure, the potential risks obtained from the case study are confirmed, and the impacts on human work are considered. Both the case study and the evaluation of user experience provide compatible results and conclusions: (A) the proposed solution is promising for improving traffic safety and reducing workload by detecting potential risks in advance; (B) the AI-driven techniques and the whole diagram should be enhanced to eliminate possible distraction of air traffic controllers' attention. Finally, a variety of strategies and approaches are discussed to explore their capability to advance the proposed solution toward industrial practice.
Steganography based on bit modification of speech frames is a commonly used method that targets RTP payloads and offers covert communication over voice-over-IP (VoIP). However, direct modification of frames is often independent of the inherent speech features, which may lead to great degradation of speech quality. A novel steganography method based on frame bitrate change is proposed in this work, which discovers a new covert channel for VoIP and introduces less distortion. This method exploits a feature of multi-rate speech codecs: the actual bitrate of a speech frame is identified only by the speech decoder at the receiving end. Based on this characteristic, two steganography strategies, called bitrate downgrading (BD) and bitrate switching (BS), are provided. The first strategy substitutes high-bitrate speech frames with lower-bitrate ones to embed the secret message, which introduces very low distortion in practice, much less than other bit-modification-based methods with the same embedding capacity. The second one encodes secret message bits into different types of speech frames and serves as a supplementary alternative. The two strategies are implemented and tested on our covert communication system StegVoIP. The experimental results show that the proposed method is effective and fulfills the real-time requirement of VoIP communication.
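The bitrate switching (BS) idea described above, encoding secret bits in the choice of frame type, can be illustrated with a minimal Python sketch. The two rate-mode labels, the one-bit-per-frame mapping, and all function names are hypothetical; the abstract does not specify the codec modes or the mapping used in StegVoIP, so this is only an illustration of the general principle.

```python
from typing import Iterable, List

# Hypothetical rate modes of a multi-rate codec. The labels and the
# bit-to-mode mapping are assumptions, not taken from the paper.
MODE_FOR_BIT = {0: "rate_A", 1: "rate_B"}
BIT_FOR_MODE = {mode: bit for bit, mode in MODE_FOR_BIT.items()}


def bytes_to_bits(data: bytes) -> List[int]:
    """Expand bytes into a list of bits, most significant bit first."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def bits_to_bytes(bits: List[int]) -> bytes:
    """Pack bits (MSB first) into bytes; expects a multiple of 8 bits."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        value = 0
        for bit in bits[i:i + 8]:
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)


def embed_bits(secret_bits: Iterable[int]) -> List[str]:
    """Sender side: pick one frame rate mode per secret bit."""
    return [MODE_FOR_BIT[bit] for bit in secret_bits]


def extract_bits(frame_modes: Iterable[str]) -> List[int]:
    """Receiver side: read the secret bits back from the frame rate modes."""
    return [BIT_FOR_MODE[mode] for mode in frame_modes]


if __name__ == "__main__":
    modes = embed_bits(bytes_to_bits(b"hi"))
    print(modes[:4])
    print(bits_to_bytes(extract_bits(modes)))
```

In this toy mapping each frame carries exactly one hidden bit, so the covert capacity equals the frame rate; a real system would also have to keep the resulting sequence of rate changes plausible for the codec.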
In medium- or long-distance (> 10 km) underwater acoustic speech communication, the information transfer rate is constrained by the complicated, time-varying channel and the limited bandwidth. The bit rate of speech coding is therefore required to be as low as possible. The time delay of underwater acoustic wave propagation can be exploited for low-bit-rate speech coding. After investigating the Mixed Excitation Linear Prediction (MELP) standard and taking auditory perceptual features into account, a variable and adjustable bit rate speech codec algorithm is proposed, whose average bit rate is about 600 bps. The average Perceptual Evaluation of Speech Quality Mean Opinion Score (PESQ MOS) of the synthesized speech is about 2.8. Computer simulation and a sea trial show that the proposed algorithm performs well and is robust when the bit error rate is no more than 10^-3. The synthesized speech is vivid and intelligible, and keeps the main individual characteristics of the speaker.
The 4th National Conference on Speech, Image, Communication and Signal Processing, which was sponsored by the Institute of Speech, Hearing, and Music Acoustics, Acoustical Society of China and the Institute of Signal Processing, Electronic Society of China, was held 25-27 October 1989 at Beijing Institute of Posts and Telecommunications. The conference drew a registration of 150 from different places in the country, which made it the largest conference in the last eight years. The president of the Institute of Speech, Hearing, and Music Acoustics, ASC, Professor ZHANG Jialu, made an opening speech at the opening session, and the honorary president of the Acoustical Society of China, Professor MAA Dah-You, and the president of
According to the principles of the Course Standard of English for Ordinary Senior Schools, published by the Chinese Ministry of Education in 2001, the general goal of the English curriculum at the basic education stage is to foster students' cross-cultural awareness, that is, to develop their cross-cultural communicative competence. To achieve this goal, high school English teachers must above all have a good command of cross-cultural pragmatic competence. Having surveyed papers in various periodicals and on the Internet, the writer finds that most studies have dealt with the pragmatic failures of ESL students but few with those of ESL teachers. Therefore, this thesis studies the present situation: How much do schoolteachers of English know about pragmatics? What about their cross-cultural communicative competence? Can they communicate with native speakers fluently and appropriately? What are their most serious problems in pragmatics? What can be done to solve these problems? With these questions in mind, this research has collected considerable firsthand data and information. Only after the basic situation is known as a whole can action be taken to solve the problems existing in everyday ESL teaching, so this research is necessary and urgent for present basic English pedagogy. However, since the writer's ability and research conditions are limited, a wider investigation across many places was not possible. Therefore, Bameng League, Inner Mongolia, China, was selected as the area of the survey; its situation is believed to represent basic education in the poor and remote districts of northern and western China, where education is less developed.
The questionnaire consists of two parts: a table of individual information and a test paper. The former collects the informants' basic information, including name, age, sex, years of teaching, professional title, degree, academic career, college or university of graduation, workplace, grades taught, and other items. The latter is the test paper taken from He Ziran and Yan Zhuang (1986), that is, Chinese Students' Pragmatic Failure in English Communication: Survey of the Pragmatic Difference between Chinese and English. It contains 48 items, with items on pragmalinguistic and socio-pragmatic competence arranged in mixed order. 120 copies of the questionnaire were handed out and 86 were returned, but only 64 were counted as collected. The survey covers over 10 high schools in all 7 banners of Bameng League, including two autonomous-region-level key schools and five league-level key schools. After the investigation, the writer carefully corrected, marked, arranged, summarized, and analyzed the questionnaires and obtained plenty of firsthand data and information, from which conclusions were drawn. The results of the survey show that linguistic competence is not equal to pragmatic competence; that ESL high school English teachers may use English rather fluently, without many mistakes in vocabulary and grammar, but this does not necessarily mean that they can use the target language appropriately to communicate with native speakers without pragmatic failures; and that the informants' knowledge of pragmatics and their pragmatic competence are rather poor. As a result, improving their knowledge and competence in pragmatics is a necessary and urgent task for raising the quality of ELT at the basic education stage.
To solve the above problems, the writer makes the following suggestions: (1) Pragmatics should be required as an essential course for English majors in colleges and universities, especially in normal colleges and universities; (2) In-service high school English teachers must study courses in pragmatics and cross-cultural communication; (3) In some poor and remote districts, if educational conditions do not permit, textbooks on pragmatics and cross-cultural communication must be handed out to high school English teachers for self-study, with tests held regularly and certificates given to those who pass; (4) In some developed districts, if conditions permit, English teachers can be sent to English-speaking countries for further education in order to gain a good command of the English language in authentic contexts; (5) Modern multimedia, such as VCD or DVD videos, TV or radio programs, and the Internet, should be widely and frequently applied to teacher training; (6) More native-speaker teachers and experts from English-speaking countries must be invited to high schools to train teachers of English.
Voice over IP (VoIP) and Voice over ATM (VoA) are two research hotspots in the field of communication technology. One of the main problems degrading the QoS in VoIP and VoA is the long time delay incurred when speech packets are transmitted from a speaker to a listener. This paper discusses in detail the composition of the time delay and the influence of its different parts on speech quality in IP and ATM network environments. The author puts forward a new method that converts ordinary CBR code streams into VBR streams in order to reduce the time delay from one terminal to another, which avoids the dilemma between compressing the bit rates of speech sources and reducing the assembly delay of speech packets.
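The dilemma mentioned above can be made concrete with a small calculation: for a fixed packet payload, the time needed to fill the packet with coded speech (the assembly delay) grows as the source bit rate falls. The Python sketch below assumes a 48-byte payload, the size of an ATM cell payload, filled entirely with speech and ignores adaptation-layer overhead; the 64 kbit/s and 8 kbit/s source rates and the function name are illustrative choices, not values from the paper.

```python
def assembly_delay_ms(payload_bytes: int, bit_rate_bps: float) -> float:
    """Time in milliseconds for a constant-bit-rate source to fill one payload."""
    if bit_rate_bps <= 0:
        raise ValueError("bit rate must be positive")
    return payload_bytes * 8 * 1000 / bit_rate_bps


if __name__ == "__main__":
    payload = 48  # bytes; assumed to be filled entirely with speech
    for rate in (64_000, 8_000):  # illustrative source rates in bit/s
        delay = assembly_delay_ms(payload, rate)
        print(f"{rate} bit/s -> {delay:.1f} ms")
    # 64000 bit/s -> 6.0 ms
    # 8000 bit/s -> 48.0 ms
```

Compressing the source by a factor of eight thus multiplies the per-packet assembly delay by eight, which is the trade-off a CBR-to-VBR conversion is meant to sidestep.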
This paper presents the real-time implementation of a 6.75 kb/s speech codec for the GSM half-rate digital cellular system based on CELP [1]. Logarithmic Area Ratio (LAR) [2] quantization for the short-term parameters and an efficient adaptive codebook search are used. An overlapping center-clipping codebook and a formula for fast searching are proposed. The MOS of the synthesized speech is over 3.5.
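For readers unfamiliar with the Logarithmic Area Ratio representation named above, the Python sketch below shows the usual transform between a reflection coefficient k and its LAR value, log((1 + k) / (1 - k)). The base-10 logarithm and the sign convention are assumptions (both vary between texts), and the quantizer actually used in the paper is not described in the abstract, so none is shown here.

```python
import math


def reflection_to_lar(k: float) -> float:
    """Map a reflection coefficient in (-1, 1) to its log area ratio."""
    if not -1.0 < k < 1.0:
        raise ValueError("reflection coefficient must lie strictly inside (-1, 1)")
    return math.log10((1.0 + k) / (1.0 - k))


def lar_to_reflection(lar: float) -> float:
    """Inverse transform: recover the reflection coefficient from its LAR."""
    ratio = 10.0 ** lar
    return (ratio - 1.0) / (ratio + 1.0)


if __name__ == "__main__":
    for k in (-0.9, 0.0, 0.5, 0.95):
        lar = reflection_to_lar(k)
        print(f"k={k:+.2f}  LAR={lar:+.4f}  back={lar_to_reflection(lar):+.4f}")
```

The transform stretches the region near |k| = 1, where the synthesis filter is most sensitive to coefficient errors, which is why a uniform quantizer applied to LAR values tends to behave better than one applied to the reflection coefficients directly.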
Funding (study of impromptu speeches at the "FLTRP Cup" English Speaking Contest): supported by the Fundamental Research Funds for the Central Universities (2011WC033) and the High Level Fundamental Research Funds for the Development of Foreign Language Discipline of HUST (Huazhong University of Science and Technology).
Funding (study of AI-driven safety monitoring of air traffic): supported by the National Natural Science Foundation of China (Nos. 62001315, 71971150, U20A20161), the Open Fund of Key Laboratory of Flight Techniques and Flight Safety, Civil Aviation Administration of China (No. FZ2021KF04), and the Fundamental Research Funds for the Central Universities of China (No. 2021SCU12050).
Funding (study of frame-bitrate-change steganography for VoIP): Project (2011CB302305) supported by the National Basic Research Program (973 Program) of China; Projects (61232004, 61302094) supported by the National Natural Science Foundation of China; Project (ZQN-PY115) supported by the Promotion Program for Young and Middle-aged Teacher in Science and Technology Research of Huaqiao University, China; Project (JA13012) supported by the Education Science Research Program for Young and Middle-aged Teacher of Fujian Province of China; Project (2014J01238) supported by the Natural Science Foundation of Fujian Province of China.
Funding (study of low-bit-rate speech coding for underwater acoustic communication): supported by the National Natural Science Foundation of China (61102152).
Funding (GSM half-rate speech codec implementation): the Ministry of Posts and Telecommunications of China.