Funding: The Major Key Project of PCL, Grant/Award Number: PCL2022A03; National Natural Science Foundation of China, Grant/Award Numbers: 61976064, 62372137; Zhejiang Provincial Natural Science Foundation of China, Grant/Award Number: LZ22F020007.
Abstract: Adversarial attacks pose significant security concerns for intelligent systems such as speaker recognition systems (SRSs). Most attacks assume that the neural networks in these systems are known beforehand, whereas black-box attacks operate without such information and so better match practical situations. Existing black-box attacks improve transferability by integrating multiple models or training on multiple datasets, but these methods are costly. Motivated by the optimisation strategy with spatial information on the perturbed paths and samples, the authors propose a Dual Spatial Momentum Iterative Fast Gradient Sign Method (DS-MI-FGSM) to improve the transferability of black-box attacks against SRSs. Specifically, DS-MI-FGSM needs only a single data sample and one model as input; by extending to the neighbouring spaces of the data and the model, it generates adversarial examples against the integrated models. To reduce the risk of overfitting, DS-MI-FGSM also introduces gradient masking to improve transferability. The authors conduct extensive experiments on the speaker recognition task, and the results demonstrate the effectiveness of their method, which achieves an attack success rate of up to 92% on the victim model in black-box scenarios with only one known model.
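For orientation, the baseline momentum iterative FGSM update that DS-MI-FGSM extends can be sketched as follows. This is a minimal sketch of plain MI-FGSM only; it does not implement the dual-spatial extension or the gradient masking described in the abstract. The names `mi_fgsm` and `grad_fn`, the constant toy gradient, and all parameter values are illustrative assumptions rather than details taken from the paper.

```python
from typing import Callable, List


def sign(v: float) -> int:
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def mi_fgsm(
    x: List[float],
    grad_fn: Callable[[List[float]], List[float]],
    eps: float,
    steps: int,
    mu: float = 1.0,
) -> List[float]:
    """Plain MI-FGSM: accumulate L1-normalised gradients with momentum mu,
    step along the sign of the accumulated gradient, and keep the result
    inside the L-infinity ball of radius eps around the clean input x."""
    alpha = eps / steps            # per-step size
    g = [0.0] * len(x)             # accumulated (momentum) gradient
    adv = list(x)
    for _ in range(steps):
        grad = grad_fn(adv)        # gradient of the loss w.r.t. the input
        l1 = sum(abs(v) for v in grad) or 1.0   # avoid division by zero
        g = [mu * gi + gr / l1 for gi, gr in zip(g, grad)]
        adv = [a + alpha * sign(gi) for a, gi in zip(adv, g)]
        # Project back into the eps-ball around the original sample.
        adv = [min(max(a, xi - eps), xi + eps) for a, xi in zip(adv, x)]
    return adv


# Toy usage: a hypothetical linear loss whose gradient is constant.
toy_grad = lambda _x: [0.5, -2.0, 0.0]
adversarial = mi_fgsm([0.0, 0.0, 0.0], toy_grad, eps=0.1, steps=10)
print([round(v, 6) for v in adversarial])
```

In a real attack, `grad_fn` would return the gradient of the surrogate model's loss with respect to the audio input; the constant gradient here only keeps the sketch self-contained.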
Abstract: This paper examines whether Chinese native speakers (CNSs) have difficulty understanding English counterfactuals, whether CNSs have counterfactual reasoning problems in their own language, what the causes of these difficulties may be, and what problems arise in teaching English subjunctives. It also proposes ways to improve CNSs' English counterfactual comprehension.
Abstract: Grounded in the interactive relationship between intercultural communication (IC) and foreign language education, and in the recent gradual salience of communicative language teaching (CLT) in foreign language grammar learning, the study reported in this paper deals with teaching Korean grammar to non-native speakers in the context of teaching Korean as a foreign language (TKFL). The paper examines and analyzes several Korean language textbooks prepared for foreign learners of Korean that are used overseas, especially in Hong Kong (HK). It also evaluates the textbooks in terms of CLT and communicative competence. Doing so helps us further understand the methods of Korean grammar instruction provided to learners of Korean as a second or foreign language.
Abstract: This study examined the ability of non-native speakers (NNSs) to modify their interlanguage utterances into modified comprehensible output in response to other-initiation and self-initiation, in both native speaker (NS)-NNS and NNS-NNS interactions. It was a qualitative study that used two tasks, a picture-dictation task and an opinion-exchange task, to collect data. There were 32 participants, whose ages ranged from 22 to 37. The author proposed two hypotheses based on the expectation that NNS-NNS interactions would provide more opportunities than NS-NNS interactions for NNS participants to produce comprehensible output in response to other-initiated clarification requests and self-initiated clarification attempts. The author made good use of numbers to illustrate and describe the data.
Abstract: An important concern for the deaf community is the partial or total inability to hear. This may affect language development during childhood, which limits daily life. To support deaf speakers through assistive mechanisms, an effort has been made to understand the acoustic characteristics of deaf speakers by evaluating territory-specific utterances. Speech signals were acquired from 32 normal and 32 deaf speakers uttering ten words in Tamil, a native Indian language. Speech parameters such as pitch, formants, signal-to-noise ratio, energy, intensity, jitter and shimmer were analyzed. The results show that the acoustic characteristics of deaf speakers differ significantly and that their quantitative measures dominate those of the normal speakers for the words considered. The study also reveals that the informative part of speech in normal and deaf speakers may be identified using the acoustic features. In addition, these attributes may be used for differential correction of deaf speakers' speech signals and may help listeners understand the conveyed information.
Abstract: The intelligibility of Thai speakers' English pronunciation from Chinese listeners' perspectives has not been analyzed in depth in current research. The interconnection between China and Thailand has developed greatly with the support of the 'One Belt and One Road' initiative. However, non-native English speakers inevitably encounter pronunciation problems that affect their communication with each other. For Chinese learners of English, a Thai accent can be a considerable challenge in conversation with Thai speakers. To promote better communication, it is necessary to analyze the origin of Thai speakers' English pronunciation problems. This research therefore draws on previous studies to categorize the common pronunciation problems among Thai speakers. However, English learners in Thailand are numerous and vary in language proficiency. For this reason, two Thai speakers with different IELTS scores were selected for comparison, and six postgraduate students at the Education University of Hong Kong were invited to serve as listeners. Based on the listeners' feedback, the research offers suggestions for enhancing the intelligibility of Thai speakers' pronunciation.
Abstract: The study explores native and non-native English speakers' attitudes towards accents and pronunciation-related issues. The sample group surveyed is composed of non-native English speakers, specifically Italian students studying at the University of Calabria (Italy), and native English speakers from Alberta University (Canada) and Florida Atlantic University (USA). A link to an online questionnaire was sent by email to all participants and was used as the research instrument to collect quantitative data. The research questions investigate learners' attitudes in relation to the following aspects: accent and identity, beliefs about native/non-native accents, the impact of pronunciation on communication, and learners' expectations of pronunciation teaching. Firstly, mean scores for these aspects will be examined. Secondly, differences between native and non-native speakers' responses will be statistically analysed. Thirdly, non-native learners' responses will be correlated with their proficiency level in English to identify the extent to which language competence may affect learners' attitudes. The study aims to gain insights that may raise students' and teachers' awareness of which models learners are expected to imitate and attain in the English language classroom, and of how appropriate and relevant these models are, especially in a globalized English world where non-native speakers will increasingly use English in a diversity of forms to achieve their communicative goals. The preliminary results will be presented and pedagogical considerations suggested.
Abstract: The paper concerns the issue of English as a lingua franca (ELF) in the European and Asian context. The authors start from a brief conceptual perspective to shed light on salient aspects of ELF. The paper then discusses a study investigating interactions among non-native speakers (NNS) of English in naturalistic settings, namely in Zhangjiajie (China), Masouri (Kalymnos, Greece), and Unterwasser (Switzerland). The main objective of the research, which was based on a qualitative methodology, was to analyze the ELF interactions from a linguistic point of view, focusing on lexicogrammatical and pragmatic features. The secondary objective was to establish whether the identified ELF features contributed to communication intelligibility. The results indicated a few significant similarities with Seidlhofer's list of ELF characteristics. Furthermore, the study established that the ELF features did not interfere with effective communication between interlocutors.
Funding: The authors are grateful to the Taif University Researchers Supporting Project, number TURSP-2020/36, Taif University, Taif, Saudi Arabia.
Abstract: Automatic Speaker Identification (ASI) involves the process of distinguishing an audio stream associated with numerous speakers' utterances. Some common aspects, such as the framework difference, the overlapping of different sound events, and the presence of various sound sources during recording, make the ASI task much more complicated. This research proposes a deep learning model to improve the accuracy of the ASI system and reduce the model training time under limited computation resources. The performance of the transformer model is investigated. Seven audio features (chromagram, Mel-spectrogram, tonnetz, Mel-Frequency Cepstral Coefficients (MFCCs), delta MFCCs, delta-delta MFCCs and spectral contrast) are extracted from the ELSDSR, CSTRVCTK, and Ar-DAD datasets. The evaluation of various experiments demonstrates that the best performance was achieved by the proposed transformer model using all seven audio features on all datasets. For ELSDSR, CSTRVCTK, and Ar-DAD, the highest attained accuracies are 0.99, 0.97, and 0.99, respectively. The experimental results reveal that the proposed technique can achieve the best performance for ASI problems.
Abstract: Most current security and authentication systems are based on personal biometrics. Security is a major issue in the field of biometric systems because databases store the original biometrics, which are lost forever if those databases are attacked. Protecting privacy is the most important goal of cancelable biometrics. To protect privacy, cancelable biometrics should be non-invertible, so that no information can be recovered from the cancelable biometric templates stored in personal identification/verification databases. One way to achieve non-invertibility is to employ non-invertible transforms. This work suggests an encryption process for cancelable speaker identification using a hybrid encryption system. The system includes 3D Jigsaw transforms and the Fractional Fourier Transform (FrFT). The proposed scheme is compared with the optical Double Random Phase Encoding (DRPE) encryption process. The evaluation of simulation results for cancelable biometrics shows that the proposed algorithm is secure, authoritative, and feasible. The encryption and cancelability effects are good and reveal good performance. The work also introduces recommended security and robustness levels for use in achieving efficient cancelable biometric systems.
Abstract: The use of voice for biometric authentication is an important technological development because it is a non-invasive identification method and does not require special hardware, so it is less likely to arouse user aversion. This study applies voice recognition technology to a speech-driven interactive voice response questionnaire system, aiming to upgrade the traditional speech system to an intelligent voice response questionnaire network so that the new device may offer enterprises more precise data for customer relationship management (CRM). The intelligent voice response device is becoming a new mobile channel, with questionnaire functions built in for the convenience of collecting information on local preferences that can be used for localized promotion and publicity. The authors propose a framework that uses voice recognition and intelligent analysis models to identify target customers through voice messages gathered in the voice response questionnaire system, that is, transforming the traditional speech system into an intelligent voice complex. The speaker recognition system discussed here employs volume as the acoustic feature in endpoint detection because the computational load of this method is usually low. To correct two types of errors that ambient noise causes in endpoint detection, this study suggests two improvements. First, to reach high accuracy, it uses a dynamic time warping (DTW) based method for speaker identification. Second, it seeks to avoid endpoint detection errors by filtering noise from voice signals before recognition and deleting any test utterances that might negatively affect the recognition results. It is hoped that doing so improves the recognition rate. According to the experimental results, the proposed method has a high recognition rate on both personal-level and industrial-level computers and can reach the practical application standard. Therefore, the voice management system in this research can be used as virtual customer service staff.
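As a rough illustration of the DTW-based identification step mentioned in the abstract, the sketch below computes a classic dynamic time warping distance between two one-dimensional feature sequences and picks the enrolled speaker whose template is closest. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `dtw_distance` and `identify`, the absolute-difference local cost, and the toy templates are all hypothetical.

```python
from typing import Dict, List, Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic DTW: minimum cumulative |a[i] - b[j]| cost over all
    monotonic alignments of the two sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost: List[List[float]] = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] repeats a match with b[j-1]
                cost[i][j - 1],      # b[j-1] repeats a match with a[i-1]
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def identify(utterance: Sequence[float],
             templates: Dict[str, Sequence[float]]) -> str:
    """Return the name of the enrolled template with the smallest DTW distance."""
    return min(templates, key=lambda name: dtw_distance(utterance, templates[name]))


# Toy usage with made-up per-frame volume values for two hypothetical speakers.
enrolled = {"alice": [1.0, 2.0, 3.0], "bob": [5.0, 5.0, 5.0]}
print(identify([1.0, 1.0, 2.0, 3.0], enrolled))
```

A practical system would compare multi-dimensional frame features after endpoint detection and noise filtering; scalar sequences are used here only to keep the example short.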
Abstract: Previous studies have investigated the efficiency of teaching listener and speaker repertoires to children diagnosed with autism spectrum disorder (ASD). Some investigations focused on listener responding by function, feature, and class (LRFFC) and intraverbal by function, feature, and class (FFC). For some children, teaching intraverbal FFC was more efficient because it resulted in better emergence of a related untaught repertoire (LRFFC). For other children, teaching LRFFC along with tacting pictures was more efficient, resulting in better emergence of a related untaught repertoire (intraverbal FFC). In these cases, it is not clear whether the tact increased the efficiency of LRFFC training, because no comparison was made with a condition in which tacts were not required. This investigation consisted of a replication with two children diagnosed with ASD. Three instructional sequences were compared: teaching LRFFC and probing intraverbal; teaching LRFFC plus tacts and probing intraverbal; and teaching intraverbal and probing LRFFC. For one child, all sequences were equally efficient because all related untaught repertoires emerged without errors. However, the acquisition of intraverbals during training occurred with variability. For the second child, the most efficient sequence consisted of teaching intraverbals, which resulted in the emergence of LRFFC without errors. In both cases of teaching LRFFC, the emergence of related intraverbals was partial and acquisition of the trained repertoires occurred with variability. The case that did not demand tact responses was slightly more efficient. The data are discussed in terms of the possibility that the best instructional sequence varies from learner to learner.