A machine learning based speech enhancement method is proposed to improve the intelligibility of whispered speech. A binary mask estimated by a two-class support vector machine (SVM) classifier is used to synthesize the enhanced whisper. A novel noise-robust feature called Gammatone feature cosine coefficients (GFCCs), extracted by an auditory periphery model, is derived and used for the binary mask estimation. The intelligibility performance of the proposed method is evaluated and compared with traditional speech enhancement methods. Objective and subjective evaluation results indicate that the proposed method can effectively improve the intelligibility of noise-contaminated whispered speech. Compared with the power subtraction algorithm and the log-MMSE algorithm, neither of which improves intelligibility in lower signal-to-noise ratio (SNR) environments, the proposed method performs well in improving the intelligibility of noisy whisper. Additionally, the intelligibility of the whispered speech enhanced by the proposed method exceeds that of the corresponding unprocessed noisy whispered speech.
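As a rough illustration of the binary-mask synthesis step described above, the sketch below applies a 0/1 mask to a small grid of time-frequency magnitudes. The abstract gives no classifier details, so the mask here is an oracle "ideal binary mask" computed from a local-SNR threshold on known clean and noise magnitudes, standing in for the SVM decision. The function names, the sample values, and the 0 dB threshold are assumptions made for illustration only.

```python
import math

def ideal_binary_mask(clean, noise, threshold_db=0.0):
    """Return a 0/1 mask: 1 where the local SNR of a T-F unit exceeds threshold_db.

    clean, noise: lists of frames, each a list of magnitudes per frequency channel.
    This oracle mask is a stand-in for the SVM output mentioned in the abstract.
    """
    mask = []
    for c_frame, n_frame in zip(clean, noise):
        row = []
        for c, n in zip(c_frame, n_frame):
            # Local SNR in dB from the power ratio; the small floor avoids log(0).
            snr_db = 10.0 * math.log10((c * c + 1e-12) / (n * n + 1e-12))
            row.append(1 if snr_db > threshold_db else 0)
        mask.append(row)
    return mask

def apply_mask(noisy, mask):
    """Keep T-F units labelled 1 and zero out units labelled 0."""
    return [[x * m for x, m in zip(x_frame, m_frame)]
            for x_frame, m_frame in zip(noisy, mask)]

# Hypothetical 2-frame, 2-channel magnitudes.
clean = [[1.0, 0.1], [0.5, 2.0]]
noise = [[0.2, 0.4], [0.6, 0.5]]
noisy = [[1.1, 0.45], [0.9, 2.2]]

mask = ideal_binary_mask(clean, noise)
enhanced = apply_mask(noisy, mask)
```

In a classifier-based system the mask would be predicted from features of the noisy signal alone, since the clean and noise magnitudes are not available at test time.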
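This paper proposes a scheme that cascades an acousto-optic device with a coupled whispering gallery mode (WGM) microsphere cavity to achieve asymmetric transmission, and verifies it both theoretically and experimentally. In the experiment, a two-section fiber is fabricated by heating and tapering, which allows the acousto-optic effect to be excited and the whispering gallery mode to be coupled at the same time. The acousto-optic effect in the fiber converts the vector modes of the core fundamental mode into higher-order cladding modes. Because different vector modes of the fundamental mode convert into different vector modes of the cladding mode, a birefringence-like effect arises that changes the polarization of the output cladding mode. The coupled WGM microcavity then converts the cladding mode back into the core fundamental mode. Owing to the polarization selectivity of the whispering gallery modes, light incident from opposite directions experiences different transmission, with a transmission isolation of up to 17 dB. In addition, the transmittance in both directions is measured as a function of polarization angle, and the polarization change introduced by the acousto-optic effect is found to be about 80°. The asymmetric transmission scheme inherits the fast response and good tunability of acousto-optic devices, has an all-fiber structure and no operating threshold, and shows significant application potential in optical switches, optical isolators, and similar scenarios.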
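For a sense of scale, the 17 dB isolation figure can be converted to a linear ratio. The short sketch below assumes the value refers to optical power (the usual convention, though the abstract does not say so), in which case the ratio is 10^(dB/10). The function name is hypothetical.

```python
def db_to_power_ratio(db):
    """Convert a decibel value to a linear power ratio (assumes power, not amplitude)."""
    return 10.0 ** (db / 10.0)

ratio = db_to_power_ratio(17.0)
print(f"{ratio:.1f}")  # prints 50.1
```

Under that assumption, the transmitted power in one direction is roughly fifty times that in the opposite direction.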
Some factors influencing the intelligibility of the enhanced whisper in the joint time-frequency domain are evaluated. Specifically, both the spectrum density and different regions of the enhanced spectrum are analyzed. Experimental results show that for a spectrum of some density, the joint time-frequency gain-modification based speech enhancement algorithm achieves significant improvement in intelligibility. Additionally, the spectrum region where the estimated spectrum is smaller than the clean spectrum is the region that contributes most to intelligibility improvement for the enhanced whisper. The spectrum region where the estimated spectrum is larger than twice the clean spectrum is detrimental to speech intelligibility in the whisper context.
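The spectral regions mentioned above can be made concrete with a small sketch. It labels each time-frequency unit by comparing the estimated magnitude with the clean magnitude: below clean, between clean and twice clean, or above twice clean. The region names, the function names, the sample values, and the treatment of exact equality are assumptions for illustration; the abstract names only the first and last regions.

```python
def classify_region(estimated, clean):
    """Label one T-F unit by comparing estimated and clean spectral magnitudes."""
    if estimated < clean:
        return "below_clean"        # region the abstract reports as most helpful
    if estimated > 2.0 * clean:
        return "above_twice_clean"  # region the abstract reports as detrimental
    return "between"                # clean <= estimated <= 2 * clean

def region_counts(estimated_spec, clean_spec):
    """Count T-F units per region over two equally shaped magnitude grids."""
    counts = {"below_clean": 0, "between": 0, "above_twice_clean": 0}
    for e_frame, c_frame in zip(estimated_spec, clean_spec):
        for e, c in zip(e_frame, c_frame):
            counts[classify_region(e, c)] += 1
    return counts

# Hypothetical magnitudes: 2 frames x 2 frequency bins.
estimated_spec = [[0.5, 1.5], [3.0, 1.0]]
clean_spec = [[1.0, 1.0], [1.0, 1.0]]
counts = region_counts(estimated_spec, clean_spec)
```

A partition like this makes it possible to constrain or inspect each region separately when studying its effect on intelligibility.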
Cognitive performance-based dimensional emotion recognition in whispered speech is studied. First, whispered speech emotion databases and data collection methods are compared, and the character of emotion expression in whispered speech is studied, especially the basic types of emotions. Second, the emotion features for whispered speech are analyzed, and, by reviewing the latest references, the related valence features and arousal features are provided. The effectiveness of valence and arousal features in whispered speech emotion classification is studied. Finally, the Gaussian mixture model is studied and applied to whispered speech emotion recognition. Cognitive performance is also considered in emotion recognition so that recognition errors for whispered speech emotion can be corrected. The results show that the formant features are not significantly related to the arousal dimension, while the short-term energy features are related to emotion changes in the arousal dimension. Using the cognitive scores, the emotion recognition results can be improved.
Over the last decades, whispering gallery mode based sensors have become a prominent solution for label-free sensing of various physical and chemical parameters. At the same time, widespread use of the approach is hindered by the limited applicability of known configurations for quantifying ambient variations outside laboratory conditions and by their low affordability, where the need for spectrally resolved data collection is among the main limiting factors. In this paper we demonstrate the first realization of an affordable whispering gallery mode sensor powered by deep learning and multi-resonator imaging at a fixed frequency. The approach enables refractive index unit (RIU) prediction with an absolute error at the 3×10^(-6) level for a dynamic range of RIU variations from 0 to 2×10^(-3), with a temporal resolution of several milliseconds and an instrument-driven detection limit of 3×10^(-5). High sensing accuracy together with instrumental affordability and production simplicity places the reported detector among the most cost-effective realizations of the whispering gallery mode approach. The proposed solution is expected to have a great impact on shifting the whole sensing paradigm away from model-based solutions and toward flexible self-learning ones.
A simple fiber sensor for measuring angular displacement with high resolution, based on whispering gallery mode (WGM) resonance in bent optical fibers, is proposed. The sensor is composed of a single loop formed by loosely tying a knot in single-mode fiber. To measure the transmission spectra, a tunable laser and an optical power meter are connected to the two ends of the fiber loop, respectively. Significant WGM resonances occur over the investigated wavelength range for all the sensors with different bend radii. The angular-displacement sensitivity is studied in the range from -0.1° to 0.1°. A detection limit of 1.49×10^(-7) rad can be achieved for a detection system with a resolution of 1 pm. With its low detection limit and low cost, the simple loop-structure fiber sensor has potential applications in architecture and bridge building.
An improved method based on minimum mean square error short-time spectral amplitude (MMSE-STSA) estimation is proposed to cancel background noise in whispered speech. Using the acoustic character of whispered speech, the algorithm can effectively track changes in non-stationary background noise. Compared with the original MMSE-STSA algorithm and the method used in the selectable mode vocoder (SMV), the improved algorithm can further suppress the residual noise at low signal-to-noise ratio (SNR) while avoiding excessive suppression. Simulations show that in a non-stationary noisy environment, the proposed algorithm not only achieves better enhancement performance but also reduces speech distortion.
We theoretically investigate single-photon scattering by a Λ-type three-level system interacting with a whispering-gallery-type resonator that is coupled to a one-dimensional waveguide, using a full quantum-mechanical approach. The single-photon transmission and reflection amplitudes are obtained exactly via a real-space approach. The single-photon transport properties controlled by a classical optical field are discussed. The critical coupling condition in the coupled system of waveguide, whispering-gallery resonator, and three-level atom is also analyzed.
Funding (SVM binary-mask whisper enhancement): The National Natural Science Foundation of China (No. 61231002, 61273266, 51075068, 60872073, 60975017, 61003131); the Ph.D. Programs Foundation of the Ministry of Education of China (No. 20110092130004); the Science Foundation for Young Talents in the Educational Committee of Anhui Province (No. 2010SQRL018); the 211 Project of Anhui University (No. 2009QN027B).
Funding (intelligibility factors in the joint time-frequency domain): The National Natural Science Foundation of China (No. 61301295, 61273266, 61301219, 61201326, 61003131); the Natural Science Foundation of Anhui Province (No. 1308085QF100, 1408085MF113); the Natural Science Foundation of Jiangsu Province (No. BK20130241); the Natural Science Foundation of Higher Education Institutions of Jiangsu Province (No. 12KJB510021); the Doctoral Fund of Anhui University.
Funding (whispered speech emotion recognition): The National Natural Science Foundation of China (No. 11401412).
Funding (WGM fiber angular-displacement sensor): Supported by the National Basic Research Program of China ("973" Program) (2011CB013000) and the National Natural Science Foundation of China (NSFC) (90923039, 51105038).
Funding (single-photon scattering in a waveguide-resonator-atom system): Supported by the National Natural Science Foundation of China under Grant Nos. 10874134, 11004001, and 10947115, and the Anhui Province for Young Teachers Foundation under Grant No. 2010SQRL037ZD.