Funding: The National Natural Science Foundation of China (No. 61231002, 61273266, 51075068, 60872073, 60975017, 61003131); the Ph.D. Programs Foundation of the Ministry of Education of China (No. 20110092130004); the Science Foundation for Young Talents in the Educational Committee of Anhui Province (No. 2010SQRL018); the 211 Project of Anhui University (No. 2009QN027B).
Abstract: A machine-learning-based speech enhancement method is proposed to improve the intelligibility of whispered speech. A binary mask estimated by a two-class support vector machine (SVM) classifier is used to synthesize the enhanced whisper. A novel noise-robust feature, called Gammatone feature cosine coefficients (GFCCs), is extracted with an auditory periphery model and used for the binary mask estimation. The intelligibility performance of the proposed method is evaluated and compared with that of traditional speech enhancement methods. Objective and subjective evaluation results indicate that the proposed method effectively improves the intelligibility of noise-contaminated whispered speech. In contrast to the power subtraction algorithm and the log-MMSE algorithm, neither of which improves intelligibility in low signal-to-noise ratio (SNR) environments, the proposed method performs well in improving the intelligibility of noisy whisper. Additionally, whispered speech enhanced by the proposed method is more intelligible than the corresponding unprocessed noisy whispered speech.
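The binary-mask idea in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the paper estimates the mask with an SVM on GFCC features, whereas this sketch computes an oracle mask directly from known clean and noise magnitudes. The function names, the 0 dB threshold `lc_db`, and the toy values are all assumptions made for illustration.

```python
import numpy as np

def ideal_binary_mask(clean_mag, noise_mag, lc_db=0.0):
    """Return a 0/1 mask: 1 where the local SNR in dB exceeds lc_db (assumed threshold)."""
    eps = 1e-12  # guards against division by zero and log of zero
    local_snr_db = 20.0 * np.log10((clean_mag + eps) / (noise_mag + eps))
    return (local_snr_db > lc_db).astype(float)

def apply_mask(noisy_mag, mask):
    """Keep time-frequency units where mask is 1 and zero out the rest."""
    return noisy_mag * mask

# Toy 2x2 "spectrogram" magnitudes (hypothetical values).
clean = np.array([[1.0, 0.1], [0.5, 2.0]])
noise = np.array([[0.2, 0.4], [0.9, 0.1]])
mask = ideal_binary_mask(clean, noise)
enhanced = apply_mask(clean + noise, mask)
```

In a classifier-based system, the oracle computation would be replaced by a per-unit prediction, and the mask would be applied in the same way.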
Funding: The National Natural Science Foundation of China (No. 61301295, 61273266, 61301219, 61201326, 61003131); the Natural Science Foundation of Anhui Province (No. 1308085QF100, 1408085MF113); the Natural Science Foundation of Jiangsu Province (No. BK20130241); the Natural Science Foundation of Higher Education Institutions of Jiangsu Province (No. 12KJB510021); the Doctoral Fund of Anhui University.
Abstract: Some factors influencing the intelligibility of the enhanced whisper in the joint time-frequency domain are evaluated. Specifically, both the spectrum density and different regions of the enhanced spectrum are analyzed. Experimental results show that, for a spectrum of a certain density, the speech enhancement algorithm based on joint time-frequency gain modification achieves a significant improvement in intelligibility. Additionally, the spectrum region where the estimated spectrum is smaller than the clean spectrum is the region that contributes most to the intelligibility improvement of the enhanced whisper. The spectrum region where the estimated spectrum is more than twice the clean spectrum is detrimental to speech intelligibility in the whisper context.
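The two spectrum regions named in this abstract can be made concrete with a small sketch. The abstract only describes the "smaller than clean" and "larger than twice clean" regions; the middle region, the integer labels, and the function name `label_regions` are assumptions added here to complete the partition.

```python
import numpy as np

def label_regions(est_mag, clean_mag):
    """Label each time-frequency unit by how the estimate compares with the clean spectrum.

    1: estimate < clean            (reported as most helpful for intelligibility)
    2: clean <= estimate <= 2*clean (assumed middle region, not named in the abstract)
    3: estimate > 2*clean          (reported as detrimental to intelligibility)
    """
    est = np.asarray(est_mag, dtype=float)
    clean = np.asarray(clean_mag, dtype=float)
    labels = np.full(est.shape, 2, dtype=int)
    labels[est < clean] = 1
    labels[est > 2.0 * clean] = 3
    return labels

# Toy magnitudes (hypothetical values).
clean = np.array([1.0, 1.0, 1.0, 1.0])
est = np.array([0.5, 1.0, 2.0, 2.5])
regions = label_regions(est, clean)
```

Counting the units per label gives a rough view of how much of an enhanced spectrum falls into each region.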
Funding: The National Natural Science Foundation of China (No. 11401412).
Abstract: Cognitive performance-based dimensional emotion recognition in whispered speech is studied. First, whispered speech emotion databases and data collection methods are compared, and the characteristics of emotion expression in whispered speech are studied, especially for the basic types of emotions. Second, the emotion features for whispered speech are analyzed and, based on a review of the latest references, the related valence features and arousal features are provided. The effectiveness of valence and arousal features in whispered speech emotion classification is studied. Finally, the Gaussian mixture model is studied and applied to whispered speech emotion recognition. Cognitive performance is also considered in emotion recognition so that recognition errors for whispered speech emotion can be corrected. The results show that the formant features are not significantly related to the arousal dimension, while the short-term energy features are related to emotion changes in the arousal dimension. Using the cognitive scores, the recognition results can be improved.
Funding: Supported by the Independent Innovation Foundation of Shandong University (No. 2009JC004) and the Natural Science Foundation of Shandong Province (No. Y2007G31).
Abstract: An autoregressive moving average (ARMA) model for whispered speech is proposed. Compared with normal speech, whispered speech has no fundamental frequency because the glottis is semi-open and turbulent flow is created, and formant shifting exists in the lower frequency region due to the narrowing of the tract in the false vocal fold regions and weak acoustic coupling with the subglottal system. Analysis shows that the effect of the subglottal system is to introduce additional pole-zero pairs into the vocal tract transfer function. Theoretically, a method based on an ARMA process is superior to one based on an AR process in the spectral analysis of whispered speech. Two methods, the least-squares modified Yule-Walker likelihood estimate (LSMY) algorithm and the frequency-domain Steiglitz-McBride (FDSM) algorithm, are applied to the ARMA model for whispered speech. The performance evaluation shows that the ARMA model is much more appropriate for representing whispered speech than the AR model, and that the FDSM algorithm provides a more accurate estimate of the whispered speech spectral envelope than the LSMY algorithm, albeit with higher computational complexity.
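The difference between the two model classes can be written out in a generic textbook form. The symbols below (gain $G$, orders $p$ and $q$, coefficients $a_k$ and $b_k$) are assumptions for illustration and are not taken from the paper. An AR model has only poles, while an ARMA model adds zeros in the numerator, which is what allows it to represent the pole-zero pairs attributed to the subglottal system.

```latex
% All-pole (AR) model of order p:
H_{\mathrm{AR}}(z) = \frac{G}{1 + \sum_{k=1}^{p} a_k z^{-k}}

% Pole-zero (ARMA) model of orders (p, q):
H_{\mathrm{ARMA}}(z) = \frac{G \left( 1 + \sum_{k=1}^{q} b_k z^{-k} \right)}{1 + \sum_{k=1}^{p} a_k z^{-k}}
```

Setting every $b_k$ to zero reduces the ARMA form to the AR form, so the AR model is a special case of the ARMA model.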
Abstract: Curriculum development efforts have been widely investigated by researchers around the world. However, only a few studies deal specifically with curriculum development in the field of English language teaching and its implementation in the Turkish context. Hence, the present study was designed to investigate the Turkish national education curriculum and the sixth-grade EFL (English as a foreign language) curriculum in Turkey by analyzing documents belonging to the Turkish MoNE (Ministry of National Education). This study first presents what the national education curriculum and the sixth-grade EFL curriculum involve, and then investigates their drawbacks based on predetermined criteria. Finally, the study concludes with suggestions for overcoming these shortcomings.
Abstract: In advertisements directed at consumers within a society or in other societies, brands employ cultural signs (values, beliefs, rituals, heroes, and symbols), and accordingly it can be observed that they consciously make use of the terms "locality" and "globality". In this study, advertisements from four global food brands that include cultural codes, locality, and globality were randomly selected and analyzed. These advertising messages were analyzed at an intercultural level from a visual semiotics perspective. The study attempts to determine the "local" approaches of global brands by revealing the "intercultural" dimension conveyed through visual and linguistic signs in the advertisements, which were selected with an eclectic method.
Abstract: Acquisition of speaking skills is currently among the most essential language skills for students, especially among Turkish students, since language competency is measured through speaking. Every language learner wants to acquire correct communicative skills and fluency in speaking, while teachers aim to provide the necessary education in speaking. This study reveals the essence of acquiring speaking competency for learners, and how important it is to teach speaking skills with correct methods and methodology. Through a questionnaire analysis of Turkish students and teachers at a Turkish university, the theoretical knowledge on this "essence" is blended with the questionnaire results obtained from students and teachers in order to emphasize this necessity and to reach more practical solutions for the current need to learn and teach speaking.
Abstract: Global changes have taken place at breakneck speed in many fields with the Web 2.0 era, which can be described as the new Internet trend. Web pages, which once had a static structure, have become dynamic pages created by users, and in this regard they can be said to have been democratized as they evolved. Social media, which took shape alongside this era, provide a large data flow for businesses and thus present new and improvable opportunities for creating effective strategies. Today's Internet environment contains many blogs that include customers' opinions about the products and services they own. This environment, which in a way globalizes customer opinions, is a new medium worth examining because it increases business-customer interaction and, owing to its carrier nature, provides businesses with text data that can be analyzed in the field of Customer Relationship Management. Businesses should therefore follow blog environments to see how customers receive the products and services they provide, and should treat this as an important task on which effective analyses can be conducted. For this purpose, the study proposes a model that assigns the opinions in Turkish blogs to polarity poles. Opinion mining methods are used in the model, and a methodology is devised that assigns the text-based opinion data in Turkish blogs to the poles in order to obtain a general view of products and services. The success of the model's pole assignment is evaluated with the precision measure.
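The precision measure mentioned at the end of this abstract can be sketched as follows. The label names, the toy predictions, and the function signature are assumptions for illustration. The abstract does not say how the study computes precision across poles, so this sketch scores a single pole only.

```python
def precision(predicted, actual, positive="positive"):
    """Precision for one pole: true positives / (true positives + false positives)."""
    tp = sum(1 for p, a in zip(predicted, actual) if p == positive and a == positive)
    fp = sum(1 for p, a in zip(predicted, actual) if p == positive and a != positive)
    # Return 0.0 when the model never predicts the pole, to avoid division by zero.
    return tp / (tp + fp) if (tp + fp) else 0.0

# Hypothetical pole assignments for four blog opinions.
predicted = ["positive", "positive", "negative", "positive"]
actual = ["positive", "negative", "negative", "positive"]
score = precision(predicted, actual)
```

Here three opinions are assigned to the positive pole and two of those assignments are correct, so the score is two thirds.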