Speech emotion recognition (SER) uses acoustic analysis to find features for emotion recognition and examines variations in voice that are caused by emotions. The number of features acquired with acoustic analysis is extremely high, so we introduce a hybrid filter-wrapper feature selection algorithm based on an improved equilibrium optimizer for constructing an emotion recognition system. The proposed algorithm implements multi-objective emotion recognition with the minimum number of selected features and maximum accuracy. First, we use information gain and the Fisher Score to sort the features extracted from signals. Then, we employ a multi-objective ranking method to evaluate these features and assign different importance to them; features with high rankings have a large probability of being selected. Finally, we propose a repair strategy to address the problem of duplicate solutions in multi-objective feature selection, which improves the diversity of solutions and avoids becoming trapped in local optima. Using random forest and K-nearest neighbor classifiers, four English speech emotion datasets are employed to test the proposed algorithm (MBEO) against other multi-objective emotion identification techniques. The results illustrate that MBEO performs well in inverted generational distance, hypervolume, Pareto solutions, and execution time, and that it is appropriate for high-dimensional English SER.
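The filter stage described above can be illustrated with a minimal Fisher Score ranking. This is a plain-Python sketch, not the paper's implementation; the two-feature toy data and class labels are invented for demonstration:

```python
from collections import defaultdict

def fisher_scores(samples, labels):
    """Score each feature as between-class variance of its class means
    divided by the pooled within-class variance (higher = more separating)."""
    n_features = len(samples[0])
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)

    scores = []
    for j in range(n_features):
        col = [x[j] for x in samples]
        overall_mean = sum(col) / len(col)
        between, within = 0.0, 0.0
        for rows in by_class.values():
            vals = [r[j] for r in rows]
            mu = sum(vals) / len(vals)
            var = sum((v - mu) ** 2 for v in vals) / len(vals)
            between += len(vals) * (mu - overall_mean) ** 2
            within += len(vals) * var
        scores.append(between / within if within > 0 else 0.0)
    return scores

# Toy data: feature 0 separates the classes, feature 1 is noise.
X = [[0.1, 5.0], [0.2, 4.0], [2.1, 4.5], [2.0, 5.5]]
y = ["sad", "sad", "happy", "happy"]
scores = fisher_scores(X, y)
ranking = sorted(range(len(scores)), key=lambda j: -scores[j])  # ranking[0] == 0
```

In a filter-wrapper hybrid like the one described, such scores would only bias which features the wrapper search is likely to try, rather than making the final selection themselves.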
Machine Learning (ML) algorithms play a pivotal role in Speech Emotion Recognition (SER), although they encounter a formidable obstacle in accurately discerning a speaker's emotional state. The examination of speakers' emotional states holds significant importance in a range of real-time applications, including but not limited to virtual reality, human-robot interaction, emergency centers, and human behavior assessment. Accurately identifying emotions in the SER process relies on extracting relevant information from audio inputs. Previous studies on SER have predominantly utilized short-time characteristics such as Mel Frequency Cepstral Coefficients (MFCCs) due to their ability to capture the periodic nature of audio signals effectively. Although these traits may improve a model's ability to perceive and interpret emotional depictions appropriately, MFCCs have limitations. This study therefore tackles that issue by systematically selecting multiple audio cues, enhancing the classifier model's efficacy in accurately discerning human emotions. The utilized dataset is taken from the EMO-DB database. Input speech is preprocessed with a 2D Convolutional Neural Network (CNN) that applies convolutional operations to spectrograms, which afford a visual representation of how the frequency content of the audio signal changes over time. The next step is spectrogram data normalization, which is crucial for Neural Network (NN) training as it aids faster convergence. Then five auditory features (MFCCs, Chroma, Mel-Spectrogram, Contrast, and Tonnetz) are extracted from the spectrogram sequentially. The aim of feature selection is to retain only dominant features by excluding irrelevant ones; in this paper, the Sequential Forward Selection (SFS) and Sequential Backward Selection (SBS) techniques were employed to select among the multiple audio cues. Finally, the feature sets composed from the hybrid feature extraction methods are fed into a deep Bidirectional Long Short-Term Memory (Bi-LSTM) network to discern emotions. Since a deep Bi-LSTM can hierarchically learn complex features and increases model capacity through more robust temporal modeling, it is more effective than a shallow Bi-LSTM in capturing the intricate tones of emotional content present in speech signals. The effectiveness and resilience of the proposed SER model were evaluated in experiments comparing it to state-of-the-art SER techniques. The results indicated that the model achieved accuracy rates of 90.92%, 93%, and 92% over the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS), Berlin Database of Emotional Speech (EMO-DB), and Interactive Emotional Dyadic Motion Capture (IEMOCAP) datasets, respectively. These findings signify a prominent enhancement in the ability to identify emotional depictions in speech, showcasing the potential of the proposed model in advancing the SER field.
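The greedy search that SFS performs can be sketched in a few lines. This is a generic illustration, not the study's code: the scoring function below is a hypothetical stand-in for the cross-validated classifier accuracy a real wrapper would use:

```python
def sequential_forward_selection(n_features, score_fn, k):
    """Greedy SFS: starting from the empty set, repeatedly add the
    single feature whose inclusion maximizes score_fn(subset)."""
    selected = []
    remaining = set(range(n_features))
    while len(selected) < k and remaining:
        best_f, best_score = None, float("-inf")
        for f in sorted(remaining):
            s = score_fn(selected + [f])
            if s > best_score:
                best_f, best_score = f, s
        selected.append(best_f)
        remaining.remove(best_f)
    return selected

# Hypothetical scorer: rewards features 0 and 2, penalizes subset size.
def toy_score(subset):
    return sum(1.0 for f in subset if f in (0, 2)) - 0.1 * len(subset)

best = sequential_forward_selection(4, toy_score, 2)  # → [0, 2]
```

SBS is the mirror image: it starts from the full feature set and greedily removes the feature whose exclusion hurts the score least.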
In air traffic control communications (ATCC), misunderstandings between pilots and controllers could result in fatal aviation accidents. Fortunately, advanced automatic speech recognition technology has emerged as a promising means of preventing miscommunications and enhancing aviation safety. However, most existing speech recognition methods merely incorporate external language models on the decoder side, leading to insufficient semantic alignment between speech and text modalities during the encoding phase. Furthermore, it is challenging to model acoustic context dependencies over long distances because speech sequences are longer than text, especially for the extended ATCC data. To address these issues, we propose a speech-text multimodal dual-tower architecture for speech recognition. It employs cross-modal interactions to achieve close semantic alignment during the encoding stage and to strengthen its capability to model auditory long-distance context dependencies. In addition, a two-stage training strategy is devised to derive semantics-aware acoustic representations effectively. The first stage focuses on pre-training the speech-text multimodal encoding module to enhance inter-modal semantic alignment and aural long-distance context dependencies. The second stage fine-tunes the entire network to bridge the input-modality variation gap between the training and inference phases and boost generalization performance. Extensive experiments demonstrate the effectiveness of the proposed speech-text multimodal speech recognition method on the ATCC and AISHELL-1 datasets. It reduces the character error rate to 6.54% and 8.73%, respectively, and exhibits substantial performance gains of 28.76% and 23.82% compared with the best baseline model. The case studies indicate that the obtained semantics-aware acoustic representations aid in accurately recognizing terms with similar pronunciations but distinctive semantics. The research provides a novel modeling paradigm for semantics-aware speech recognition in air traffic control communications, which could contribute to the advancement of intelligent and efficient aviation safety management.
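The character error rate reported above is the standard edit-distance metric: substitutions, insertions, and deletions needed to turn the hypothesis into the reference, divided by the reference length. A minimal sketch (the sample ATC phrase is invented):

```python
def character_error_rate(reference, hypothesis):
    """CER = Levenshtein edit distance / length of the reference."""
    m, n = len(reference), len(hypothesis)
    # dp[i][j] = edits to turn reference[:i] into hypothesis[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # substitution
    return dp[m][n] / m if m else 0.0

# One character substituted ("flight" -> "fright") out of 21: CER = 1/21.
cer = character_error_rate("climb flight level 90", "climb fright level 90")
```

The toy pair also illustrates the case study's point: similar-sounding strings differ by tiny edit distances, so disambiguating them requires semantic rather than purely acoustic evidence.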
The teaching of English speech in universities aims to enhance oral communication ability, improve English communication skills, and expand English knowledge, and it occupies a core position in university English teaching. Taking the theory of second language acquisition as its background, this article analyzes the important role and value of this theory in university English speech teaching and explores how to apply it in practice. It aims to strengthen the cultivation of skilled English talent and to provide a brief reference for improving English speech teaching in universities.
The theory of indirect speech acts proposed by John Searle is a problematic issue in speech act theory, and it is subject to various criticisms. This essay reviews the main arguments and significant problems concerning indirect speech acts. The review covers some important concepts of speech act theory related to indirect speech acts, as well as the inference theory and the idiom theory that underlie indirect speech acts and the major problems each faces in accounting for them.
According to J. L. Austin and J. R. Searle's Speech Act Theory, many actions can actually be performed with words; that is to say, an important part of the meaning of an utterance is what that utterance does. Language is not only used to inform or describe things; it is often used to "do things". Utterances comprise three kinds of acts, performed simultaneously: the Locutionary Act, the Illocutionary Act, and the Perlocutionary Act. Through this study, we see that speech can go far beyond the level of the language system.
Pragmatics is a relatively new subject compared with other linguistic studies. In this paper, we review the definition of pragmatics and its scope. A detailed review of speech acts follows in the second part of the paper, covering many aspects such as their definition, classification, and felicity conditions. The last part of the paper discusses the prospects for applying pragmatics.
The current article mainly focuses on identifying pragmatic skills and speech acts and on illustrating pragmatic failures in second language learners. A brief discussion of the necessity of teaching pragmatic skills and speech acts is also provided.
Speech act theory has been widely studied in linguistics, but it has rarely been applied to the analysis of literary works. This paper interprets Mark Twain's short story "Is He Living or Is He Dead?" from the perspective of the speech act theory of pragmatics.
Indirect speech acts are frequently used in verbal communication, and their interpretation is of great importance for meeting the demands of developing students' communicative competence. This paper therefore presents Searle's indirect speech acts and explores how indirect speech acts are interpreted in accordance with two influential theories. It consists of four parts. Part one gives a general introduction to the notion of speech act theory. Part two elaborates on the conception of indirect speech act theory proposed by Searle and his supplement to and development of illocutionary acts. Part three deals with the interpretation of indirect speech acts. Part four draws implications from the previous study and serves as the conclusion of the dissertation.
Indirect speech act theory is a part of pragmatics. In indirect speech acts, the speaker often says one thing to mean something else. It is essential for language learners to know not only the literal meaning of a sentence but also its illocutionary act, so that people can communicate with each other efficiently. This paper uses examples to illustrate some functions and usages of indirect speech acts.
One of the most compelling notions in pragmatics is the Speech Act. Among its sub-acts, the Perlocutionary Act tells us something about people's motivation for using a particular speech act. This paper justifies an interactive approach to speech acts with perlocution as the pivot, and then attempts to explore the stages of the interactive communication process in this new approach. Typical real-life examples are provided throughout the analysis.
Day by day, biometric-based systems play a vital role in our daily lives. This paper proposes an intelligent assistant intended to identify emotions from voice messages. A biometric system has been developed to detect human emotions based on voice recognition and to control a few electronic peripherals for alert actions. The proposed smart assistant aims to support people through buzzer and light-emitting diode (LED) alert signals, and it also keeps track of places such as households, hospitals, and remote areas. The proposed approach is able to detect seven emotions: worry, surprise, neutral, sadness, happiness, hate, and love. The key element in implementing speech emotion recognition is voice processing; once the emotion is recognized, the machine interface automatically triggers alert actions via the buzzer and LED. The proposed system is trained and tested on various benchmark datasets, i.e., the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS), the Acoustic-Phonetic Continuous Speech Corpus (TIMIT), and the Emotional Speech Database (Emo-DB), and is evaluated on various parameters, i.e., accuracy, error rate, and time. Compared with existing technologies, the proposed algorithm gave a better error rate and less time: error rate and time are decreased by 19.79% and 5.13 s for the RAVDESS dataset, 15.77% and 0.01 s for the Emo-DB dataset, and 14.88% and 3.62 s for the TIMIT dataset. The proposed model shows better accuracy of 81.02% for the RAVDESS dataset, 84.23% for the TIMIT dataset, and 85.12% for the Emo-DB dataset compared to the Gaussian Mixture Model (GMM) and Support Vector Machine (SVM) models.
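The recognized-emotion-to-alert step described above is essentially a dispatch table. The sketch below is purely illustrative: the emotion-to-action mapping and the actuator interface are hypothetical stand-ins for the real buzzer/LED GPIO calls, and only a subset of the seven emotions is shown:

```python
# Hypothetical mapping from recognized emotion to alert actions.
ALERT_ACTIONS = {
    "worry":   {"buzzer": True,  "led": "red"},
    "hate":    {"buzzer": True,  "led": "red"},
    "sadness": {"buzzer": False, "led": "blue"},
    "neutral": {"buzzer": False, "led": "green"},
}

def dispatch_alert(emotion, actuator_log):
    """Look up and trigger the configured alert for a recognized emotion.
    Unrecognized emotions fall back to no alert. actuator_log stands in
    for real hardware calls so the behavior can be inspected."""
    action = ALERT_ACTIONS.get(emotion, {"buzzer": False, "led": "off"})
    actuator_log.append(("buzzer", action["buzzer"]))
    actuator_log.append(("led", action["led"]))
    return action

log = []
dispatch_alert("worry", log)  # sounds the buzzer, lights the red LED
```

Keeping the mapping in data rather than code makes it easy to reconfigure alerts per deployment site (household, hospital, remote area) without touching the recognition pipeline.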
In this study, the presentation of speech acts in six oral English textbooks is evaluated from both qualitative and quantitative perspectives to see how speech acts are presented and whether enough explicit meta-pragmatic and contextual information is provided. Results show that: 1) there is a paucity of speech acts, and the average percentage of textbook content including speech acts is only 28.3%; some speech acts, such as 'threatening', 'warning', 'declaring', and 'welcoming', are not presented at all. 2) Meta-pragmatic and contextual information is too general and far from sufficient. From Book 1 to Book 5, contextual information must be deduced by learners through reading the conversations; only in Book 6 is a contextual description provided before the conversation begins. Contextual information such as the age, gender, and social status of Speaker and Hearer is never presented, while information such as the relationship between Speaker and Hearer and the place where the conversation happens must be inferred from the conversations. Meta-pragmatic information such as the degree of formality, politeness strategy, indirect speech act strategy, and social norms is not involved at all; only Book 1 provides a cultural tip. Since oral English textbooks are one of the main sources for Chinese EFL learners to enhance their pragmatic competence, it is much expected that they present a wide variety of commonly used speech acts with rich contextual information as appropriate language input.
In cross-cultural communication, miscommunication or misunderstanding is a common phenomenon because people from different backgrounds have different rules of speech acts and different cultural conventions. In order to overcome the difficulties of miscommunication and improve effective cross-cultural communication, more research is needed on speech acts from culture to culture. In this paper, the author mainly examines three speech acts studied by previous researchers (apology, greeting, and compliment), discusses what previous researchers found about them in cross-cultural communication, and then analyzes them with real examples from cross-cultural communication.
Speech emotion recognition, as an important component of human-computer interaction technology, has received increasing attention. Recent studies have treated emotion recognition of speech signals as a multimodal task, due to its inclusion of semantic features from two different modalities, i.e., audio and text. However, existing methods often fail to represent features effectively and to capture cross-modal correlations. This paper presents a multi-level circulant cross-modal Transformer (MLCCT) for multimodal speech emotion recognition. The proposed model can be divided into three steps: feature extraction, interaction, and fusion. Self-supervised embedding models are introduced for feature extraction, giving a more powerful representation of the original data than spectrograms or audio features such as Mel-frequency cepstral coefficients (MFCCs) and low-level descriptors (LLDs). In particular, MLCCT contains two types of feature interaction processes: a bidirectional Long Short-Term Memory (Bi-LSTM) network with a circulant interaction mechanism is proposed for low-level features, while a two-stream residual cross-modal Transformer block is applied when high-level features are involved. Finally, we choose self-attention blocks for fusion and a fully connected layer to make predictions. To evaluate the performance of our proposed model, comprehensive experiments are conducted on three widely used benchmark datasets: IEMOCAP, MELD, and CMU-MOSEI. The competitive results verify the effectiveness of our approach.
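The cross-modal Transformer blocks mentioned above are built on a simple primitive: one modality supplies the attention queries while the other supplies keys and values. The sketch below shows only that primitive in plain Python (not the paper's MLCCT architecture, which adds circulant interaction, residual streams, and learned projections); the tiny audio/text vectors are invented:

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def cross_modal_attention(queries, keys, values):
    """Scaled dot-product attention in which one modality (queries,
    e.g. audio frames) attends over another (keys/values, e.g. text
    tokens), producing one text-informed vector per audio frame."""
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)  # rows sum to 1
        out.append([sum(w * v[i] for w, v in zip(weights, values))
                    for i in range(len(values[0]))])
    return out

# Toy example: 2 audio-frame queries attend over 3 text-token keys/values.
audio = [[1.0, 0.0], [0.0, 1.0]]
text_keys = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
text_vals = [[1.0], [2.0], [3.0]]
fused = cross_modal_attention(audio, text_keys, text_vals)
```

Each output row is a convex combination of the text values, so the fused audio representation is pulled toward whichever text tokens its frame most resembles.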
The primary English teacher's speech acts have a major impact on foreign language teaching and learning in primary school, and a teacher's application of speech acts in the classroom is actually a kind of selective process. From the perspective of Speech Act Theory, primary English teachers can optimize their speech acts by enlivening greetings with proper contextual information, standardizing teacher talk, choosing suitable questions, and providing appropriate feedback on pupils' classroom performances, in order to improve the effectiveness of their classroom speech acts.
Refusals are frequently performed in our daily lives. Based on the speech act theory of Austin and Searle, and within the theoretical frame of the politeness theory put forward by Brown and Levinson, this paper presents a comparative study of speech acts of refusal in Chinese and American English. The results show that refusals vary in directness across situations and cultures. On the one hand, both languages employ the three directness types (the direct refusal, negation of ability, and the indirect refusal), both prefer indirect refusals, and the situational variability of directness in both languages follows a similar trend. On the other hand, Americans are more direct than Chinese, and Chinese shows a lower degree of situational variation in the use of the three directness types. People's choices of refusal strategies are influenced by social power and social distance. From all this evidence, we maintain that the cross-linguistic differences are due to basic differences in cultural values.
The speech act of complaint is an important research subject of pragmatics and is well worth studying among speech acts. With the development of research into speech acts, some scholars have investigated complaints, but little work has been done on complaints in the Chinese language, so further study of the complaint as a speech act in Chinese is necessary. This thesis, grounded in speech act theory and the politeness principle, is an empirical study of the speech act of complaint in Chinese. It aims to provide a more complete and comprehensive account of participants' production of the speech act of complaint.
Funding (speech-text multimodal ATCC study): This research was funded by the Shenzhen Science and Technology Program (Grant No. RCBS20221008093121051), the General Higher Education Project of the Guangdong Provincial Education Department (Grant No. 2020ZDZX3085), the China Postdoctoral Science Foundation (Grant No. 2021M703371), and the Post-Doctoral Foundation Project of Shenzhen Polytechnic (Grant No. 6021330002K).
文摘The teaching of English speeches in universities aims to enhance oral communication ability,improve English communication skills,and expand English knowledge,occupying a core position in English teaching in universities.This article takes the theory of second language acquisition as the background,analyzes the important role and value of this theory in English speech teaching in universities,and explores how to apply the theory of second language acquisition in English speech teaching in universities.It aims to strengthen the cultivation of English skilled talents and provide a brief reference for improving English speech teaching in universities.
文摘The theory of indirect speech acts proposed by John Searle is a problematic issue in speech act theory. The theory is subject to various criticisms.This essay reviews various arguments and the significant problems with reference to the indirect speech acts. The review includes some important concepts of speech act theory which are related to indirect speech acts; inference theory and idiom theory which underline indirect speech acts and their major problems in accounting for indirect speech acts.
文摘According to J.L.Austin and J.R.Searle Speech Act Theory,many actions can actually be performed with words that is to say an important part of the meaning of an utterance is what that utterance does.Language is not only used to inform or describe things,it is often used to "do things".The utterances are of three kinds of acts:Locutionary Act,Illocutionary Act,and Perlocutionary Act which are performed simultaneously.Through the study,we see that speech can be far beyond the level of language system.
文摘Pragmatics is a relatively new subject compared with other linguistic studies.In this paper,we will review the definition of pragmatics and its scope.Then a detailed review of speech acts will be shown in the second part of the paper.It is mainly about many aspects of speech acts,such as its definition,classification,felicity conditions and so on.The last part of the paper is the prospects of application of pragmatics.
文摘The current article mainly focuses on identifying pragmatic skills and speech acts,along with illustrating pragmatic failures in sec ond language learners.On the other hand,a brief discussion of the necessity of teaching pragmatic skills and speech acts is provided.
文摘Speech acts theory has been widely studied in linguistics, but applied into analyzing literature works has been very few. This paper interprets Mark Twain's short novel "Is He Living or Is He Dead" from the perspective of the speech acts theory of pragmatics.
Abstract: Indirect speech acts are frequently used in verbal communication, and their interpretation is of great importance for developing students' communicative competence. This paper therefore presents Searle's indirect speech acts and explores how indirect speech acts are interpreted according to two influential theories. It consists of four parts. Part one gives a general introduction to speech act theory. Part two elaborates on the conception of indirect speech act theory proposed by Searle and his supplement to and development of illocutionary acts. Part three deals with the interpretation of indirect speech acts. Part four draws implications from the previous study and serves as the conclusion.
Abstract: Indirect speech act theory is a part of pragmatics. In indirect speech acts, the speaker often says one thing to mean something else. It is essential for language learners to know not only the literal meaning of a sentence but also its illocutionary act, so that people can communicate with each other efficiently. This paper uses examples to illustrate some functions and uses of indirect speech acts.
Abstract: One of the most compelling notions in pragmatics is the speech act. Among its sub-acts, the perlocutionary act tells us something about people's motivation for using a particular speech act. This paper justifies an interactive approach to speech acts with perlocution as the pivot, and then explores the stages of the interactive communication process in this new approach. Typical real-life examples are provided in the course of the analysis.
Fund: The Deanship of Scientific Research at Majmaah University for supporting this work under Project No. R-2022-166.
Abstract: Day by day, biometric-based systems play a vital role in our daily lives. This paper proposes an intelligent assistant intended to identify emotions via voice messages. A biometric system has been developed to detect human emotions based on voice recognition and to control a few electronic peripherals for alert actions. The proposed smart assistant aims to support people through buzzer and light-emitting diode (LED) alert signals, and it also keeps track of places such as households, hospitals, and remote areas. The proposed approach is able to detect seven emotions: worry, surprise, neutral, sadness, happiness, hate, and love. The key element of the implementation is speech emotion recognition via voice processing; once the emotion is recognized, the machine interface automatically triggers the buzzer and LED actions. The proposed system is trained and tested on various benchmark datasets, i.e., the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS), the Acoustic-Phonetic Continuous Speech Corpus (TIMIT), and the Emotional Speech Database (Emo-DB), and is evaluated on accuracy, error rate, and time. Compared with existing technologies, the proposed algorithm gives a better error rate and less time: error rate and time are decreased by 19.79% and 5.13 s for the RAVDESS dataset, 15.77% and 0.01 s for the Emo-DB dataset, and 14.88% and 3.62 s for the TIMIT database. The proposed model shows better accuracy of 81.02% for the RAVDESS dataset, 84.23% for the TIMIT dataset, and 85.12% for the Emo-DB dataset compared to Gaussian Mixture Model (GMM) and Support Vector Machine (SVM) models.
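The abstract above describes a machine interface that, once an emotion is recognized, triggers buzzer and LED alert actions. A minimal sketch of such an emotion-to-peripheral dispatch might look as follows; all names (`EMOTIONS`, `AlertAction`, `dispatch_alert`) and the alert policy itself are illustrative assumptions, not taken from the paper.

```python
# Hypothetical sketch: mapping a recognized emotion label to
# buzzer/LED actions, as the described assistant does after
# its speech-emotion-recognition step.

from dataclasses import dataclass

# The seven emotions the abstract says the system can detect.
EMOTIONS = {"worry", "surprise", "neutral", "sadness",
            "happiness", "hate", "love"}

@dataclass
class AlertAction:
    buzzer_on: bool   # sound the buzzer for alert-worthy emotions
    led_color: str    # LED color shown to whoever is monitoring

# Assumed policy: negative emotions raise an audible alert,
# positive/neutral ones only change the LED.
_POLICY = {
    "worry":     AlertAction(True,  "red"),
    "sadness":   AlertAction(True,  "red"),
    "hate":      AlertAction(True,  "red"),
    "surprise":  AlertAction(False, "yellow"),
    "neutral":   AlertAction(False, "green"),
    "happiness": AlertAction(False, "green"),
    "love":      AlertAction(False, "green"),
}

def dispatch_alert(emotion: str) -> AlertAction:
    """Map a recognized emotion label to a peripheral action."""
    if emotion not in EMOTIONS:
        raise ValueError(f"unknown emotion: {emotion!r}")
    return _POLICY[emotion]

print(dispatch_alert("worry"))      # buzzer on, red LED
print(dispatch_alert("happiness"))  # buzzer off, green LED
```

In a real deployment the returned `AlertAction` would drive GPIO pins; the table-driven dispatch keeps the recognition model decoupled from the hardware policy.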
Abstract: In this study, the presentation of speech acts in six oral English textbooks is evaluated from both qualitative and quantitative perspectives to see how speech acts are presented and whether enough explicit meta-pragmatic and contextual information is provided. Results show that: 1) There is a paucity of speech acts; the average percentage of the six textbooks' content that includes speech acts is only 28.3%, and some speech acts, such as 'threatening', 'warning', 'declaring', and 'welcoming', are not presented at all. 2) Meta-pragmatic and contextual information is too general and far from sufficient. From Book 1 to Book 5, contextual information must be deduced by learners through reading the conversations; only in Book 6 is a contextual description provided before the conversation begins. Contextual information such as the age, gender, and social status of the speaker and hearer is never presented, while information such as the relationship between speaker and hearer and the place where the conversation happens must be inferred from the conversations. Meta-pragmatic information such as the degree of formality, politeness strategies, indirect speech act strategies, and social norms is not involved at all; only Book 1 provides a cultural tip. Since oral English textbooks are one of the main sources through which Chinese EFL learners enhance their pragmatic competence, they are much expected to present a wide variety of commonly used speech acts with rich contextual information as appropriate language input.
Abstract: In cross-cultural communication, miscommunication and misunderstanding are common phenomena because people from different backgrounds have different rules of speech acts and cultural conventions. To overcome these difficulties and improve effective cross-cultural communication, more research is needed on speech acts from culture to culture. In this paper, the author mainly examines three patterns of speech acts studied by previous researchers, namely apology, greeting, and compliment; discusses what previous researchers found about these three speech acts in cross-cultural communication; and then analyzes them with real examples of cross-cultural communication.
Funds: The National Natural Science Foundation of China (No. 61872231), the National Key Research and Development Program of China (No. 2021YFC2801000), and the Major Research Plan of the National Social Science Foundation of China (No. 2000&ZD130).
Abstract: Speech emotion recognition, as an important component of human-computer interaction technology, has received increasing attention. Recent studies have treated emotion recognition from speech signals as a multimodal task, since it involves the semantic features of two different modalities, i.e., audio and text. However, existing methods often fail to effectively represent features and capture correlations. This paper presents a multi-level circulant cross-modal Transformer (MLCCT) for multimodal speech emotion recognition. The proposed model comprises three steps: feature extraction, interaction, and fusion. Self-supervised embedding models are introduced for feature extraction, giving a more powerful representation of the original data than spectrograms or audio features such as Mel-frequency cepstral coefficients (MFCCs) and low-level descriptors (LLDs). In particular, MLCCT contains two types of feature interaction processes: a bidirectional long short-term memory (Bi-LSTM) network with a circulant interaction mechanism is proposed for low-level features, while a two-stream residual cross-modal Transformer block is applied when high-level features are involved. Finally, we choose self-attention blocks for fusion and a fully connected layer to make predictions. To evaluate the performance of the proposed model, comprehensive experiments are conducted on three widely used benchmark datasets, including IEMOCAP, MELD, and CMU-MOSEI. The competitive results verify the effectiveness of our approach.
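The core of a two-stream cross-modal Transformer block, as described in the abstract above, is attention in which one modality supplies the queries and the other supplies the keys and values, followed by a residual connection. A minimal NumPy sketch of one such direction (audio attending to text) is shown below; the dimensions, random weight initialization, and function names are illustrative assumptions, not the MLCCT implementation.

```python
# Sketch of single-head cross-modal attention with a residual
# connection: audio features query the text stream, so the
# output keeps the audio sequence length but mixes in text
# information wherever the two streams align.

import numpy as np

def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    x = x - x.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(audio: np.ndarray, text: np.ndarray,
                          d_k: int = 32, seed: int = 0) -> np.ndarray:
    """audio: (T_a, d) query stream; text: (T_t, d) key/value stream.
    Returns an audio-length sequence enriched with text information."""
    rng = np.random.default_rng(seed)
    d = audio.shape[1]
    # Random projection weights stand in for learned parameters.
    W_q = rng.standard_normal((d, d_k)) / np.sqrt(d)
    W_k = rng.standard_normal((d, d_k)) / np.sqrt(d)
    W_v = rng.standard_normal((d, d)) / np.sqrt(d)

    Q, K, V = audio @ W_q, text @ W_k, text @ W_v
    attn = softmax(Q @ K.T / np.sqrt(d_k))  # (T_a, T_t) alignment weights
    return audio + attn @ V                 # residual connection

audio_feats = np.random.default_rng(1).standard_normal((50, 64))
text_feats = np.random.default_rng(2).standard_normal((12, 64))
out = cross_modal_attention(audio_feats, text_feats)
print(out.shape)  # (50, 64): same length as the audio stream
```

A two-stream block would run this in both directions (audio→text and text→audio) inside Transformer layers with layer normalization and feed-forward sublayers; the sketch only isolates the cross-modal mixing step.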
Abstract: The primary English teacher's speech acts have a major impact on foreign language teaching and learning in primary school, and applying teachers' speech acts in the classroom is in fact a selective process. From the perspective of Speech Act Theory, primary English teachers can optimize their speech acts by activating greetings with proper contextual information, standardizing teacher talk, choosing suitable questions, and providing appropriate feedback on pupils' classroom performances, in order to improve the effectiveness of their classroom speech acts.
Abstract: Refusals are frequently performed in our daily lives. Based on the speech act theory of Austin and Searle and within the theoretical frame of the politeness theory put forward by Brown and Levinson, this paper presents a comparative study of speech acts of refusal in Chinese and American English. The results show that refusals vary in directness with situations and cultures. On the one hand, both languages employ the three directness types, namely the direct refusal speech act, negation of ability, and the indirect refusal speech act, and both prefer indirect refusals; the situational variability of directness in both languages follows a similar trend. On the other hand, Americans are more direct than Chinese, and Chinese shows a lower degree of situational variation in the use of the three directness types. People's choices of refusal strategies are influenced by social power and social distance. From all this evidence, we maintain that the cross-linguistic differences are due to basic differences in cultural values.
Abstract: The speech act of complaint is an important research subject in pragmatics and is worthy of study among speech acts. With the development of research into speech acts, some scholars have investigated complaints, but little work has been done on complaints in Chinese. It is therefore necessary to study the speech act of complaint in Chinese further. Based on speech act theory and the politeness principle, this thesis is an empirical study of the speech act of complaint in Chinese. It aims to provide a more complete and comprehensive account of participants' production of the speech act of complaint.