A machine-learning-based speech enhancement method is proposed to improve the intelligibility of whispered speech. A binary mask estimated by a two-class support vector machine (SVM) classifier is used to synthesize the enhanced whisper. A novel noise-robust feature, Gammatone feature cosine coefficients (GFCCs), extracted by an auditory periphery model, is derived and used for the binary mask estimation. The intelligibility performance of the proposed method is evaluated and compared with that of traditional speech enhancement methods. Objective and subjective evaluation results indicate that the proposed method effectively improves the intelligibility of whispered speech contaminated by noise. Unlike the power subtraction algorithm and the log-MMSE algorithm, neither of which improves intelligibility in low signal-to-noise ratio (SNR) environments, the proposed method improves the intelligibility of noisy whisper. Additionally, whispered speech enhanced by the proposed method is more intelligible than the corresponding unprocessed noisy whispered speech.
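To make the binary-mask idea in the abstract above concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not say how the training labels for the SVM are defined; the sketch assumes the common ideal-binary-mask convention, in which a time-frequency unit is labeled 1 when its local SNR exceeds a threshold. The function names and the `lc_db` parameter are hypothetical.

```python
import math


def ideal_binary_mask(speech_energy, noise_energy, lc_db=0.0):
    """Label each T-F unit 1 if its local SNR in dB exceeds lc_db, else 0.

    Assumption: labels follow the usual ideal-binary-mask convention;
    the abstract does not specify the labeling rule.
    """
    mask = []
    for s_row, n_row in zip(speech_energy, noise_energy):
        row = []
        for s, n in zip(s_row, n_row):
            if s <= 0.0:
                row.append(0)          # no speech energy: noise-dominant
            elif n <= 0.0:
                row.append(1)          # speech with no noise: speech-dominant
            else:
                snr_db = 10.0 * math.log10(s / n)
                row.append(1 if snr_db > lc_db else 0)
        mask.append(row)
    return mask


def apply_mask(mixture, mask):
    """Keep T-F units labeled 1 and zero out units labeled 0."""
    return [[x * m for x, m in zip(x_row, m_row)]
            for x_row, m_row in zip(mixture, mask)]


# Toy energies for a 2 x 2 grid of T-F units (illustrative values only).
speech = [[4.0, 0.5], [1.0, 9.0]]
noise = [[1.0, 2.0], [1.0, 3.0]]
mask = ideal_binary_mask(speech, noise)
```

In a system like the one described, a classifier would predict such a mask from noisy features at run time, since the separate speech and noise energies are only available during training.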
Enhancing speech intelligibility in noisy environments remains one of the major challenges for hearing-impaired listeners in everyday life. Recently, machine-learning-based approaches to speech enhancement have shown great promise for improving speech intelligibility. Two key issues in these approaches are the acoustic features extracted from noisy signals and the classifiers used for supervised learning. This paper focuses on the features. Multi-resolution power-normalized cepstral coefficients (MRPNCC) are proposed as a new feature to enhance speech intelligibility for the hearing impaired. The new feature is constructed by combining four cepstra at different time–frequency (T–F) resolutions in order to capture both local and contextual information. MRPNCC vectors and binary masking labels, calculated from signals passed through a gammatone filterbank, are used to train a support vector machine (SVM) classifier, which aims to identify the binary masking values of the T–F units in the enhancement stage. The enhanced speech is synthesized from the estimated masking values and Wiener-filtered T–F units. Objective experimental results demonstrate that the proposed feature is superior to the other features compared in terms of HIT-FA, STOI, HASPI and PESQ, and that the proposed algorithm not only improves speech intelligibility but also slightly improves speech quality. Subjective tests validate the effectiveness of the proposed algorithm for the hearing impaired.
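Of the objective measures named in the abstract above, HIT-FA is simple enough to illustrate directly. The sketch below assumes the standard definition: HIT is the fraction of speech-dominant units (ideal label 1) that the classifier also labels 1, and FA is the fraction of noise-dominant units (ideal label 0) that the classifier wrongly labels 1. The function name and the toy masks are hypothetical and are not data from the paper.

```python
def hit_minus_fa(estimated, ideal):
    """Return HIT - FA for two binary masks of identical shape.

    HIT: share of ideal-1 units that are estimated as 1.
    FA:  share of ideal-0 units that are estimated as 1.
    """
    hits = false_alarms = ones = zeros = 0
    for est_row, ref_row in zip(estimated, ideal):
        for est, ref in zip(est_row, ref_row):
            if ref == 1:
                ones += 1
                hits += int(est == 1)
            else:
                zeros += 1
                false_alarms += int(est == 1)
    hit_rate = hits / ones if ones else 0.0
    fa_rate = false_alarms / zeros if zeros else 0.0
    return hit_rate - fa_rate


# Toy masks: 3 of 4 speech-dominant units found, 1 of 4 noise units mislabeled.
ideal = [[1, 1, 0, 0], [1, 0, 0, 1]]
estimated = [[1, 0, 0, 1], [1, 0, 0, 1]]
score = hit_minus_fa(estimated, ideal)
```

A perfect estimate scores 1.0, and labeling every unit 1 scores 0.0, which is why the difference is more informative than the hit rate alone.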
In certain environments and under some conditions, video images taken by intelligent mobile video phones appear dark, and the colors are not bright or saturated enough. This paper presents an adaptive method that enhances the brightness and color performance of video images according to specific hardware properties and function parameters. The experimental results show that this method can enhance the colors and contrast of video images, based on estimated quality feature values of each frame, without using an extra Digital Signal Processor (DSP).
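The abstract above does not describe the enhancement algorithm itself, so the following is only an illustrative sketch of per-frame adaptive brightening, not the paper's method. It swaps in a plain gamma correction whose exponent is chosen from one estimated frame feature, the mean luminance, and applies it through a 256-entry lookup table, which is cheap enough to run without dedicated signal-processing hardware. The function names, the `target` parameter, and the choice of mean luminance as the quality feature are all assumptions.

```python
import math


def mean_luma(frame):
    """Mean luminance of a non-empty 8-bit grayscale frame, scaled to [0, 1]."""
    pixels = [p for row in frame for p in row]
    return sum(pixels) / (255.0 * len(pixels))


def adaptive_gamma(frame, target=0.5):
    """Pick gamma so that the frame's mean luminance maps onto `target`.

    Assumption: 0 < target < 1. Frames that are already bright enough
    (or completely black) are left unchanged with gamma = 1.
    """
    m = mean_luma(frame)
    if m <= 0.0 or m >= target:
        return 1.0
    return math.log(target) / math.log(m)


def enhance_frame(frame, target=0.5):
    """Brighten a dark frame using a gamma lookup table built once per frame."""
    g = adaptive_gamma(frame, target)
    lut = [round(255.0 * (v / 255.0) ** g) for v in range(256)]
    return [[lut[p] for p in row] for row in frame]


# Toy 2 x 3 frame that is mostly dark (illustrative values only).
dark = [[0, 32, 64], [64, 96, 255]]
brightened = enhance_frame(dark)
```

Because the exponent is recomputed for every frame, dark frames receive a strong correction while well-exposed frames pass through untouched.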
Recent advances in endoscopic imaging techniques have revolutionized the diagnostic approach to patients with inflammatory bowel disease (IBD). New, emerging endoscopic imaging techniques visualize a plethora of new mucosal details, even at the cellular and subcellular level. This review offers an overview of new endoscopic techniques for IBD, including chromoendoscopy, magnification endoscopy, spectroscopy, confocal laser endomicroscopy and endocytoscopy.
The increasing penetration of renewable energy sources (RESs) makes voltage stability a prominent problem in power systems, so a system strength evaluation method that is both accurate and practical is urgently needed to keep the scale of RES integration within a reasonable range. Therefore, a hybrid intelligence enhancement method is proposed that combines the advantages of mechanism-based and data-driven methods. First, calculation of the critical short circuit ratio (CSCR) is set as the target of intelligent enhancement, taking the multiple renewable energy station short circuit ratio as the quantitative indicator. Then, a construction process for the CSCR dataset is proposed, and a batch simulation program for samples is developed accordingly, which provides a data basis for subsequent research. Finally, a multi-task learning model based on progressive layered extraction is used to simultaneously predict the CSCR of each RES connection point, which significantly reduces the evaluation error caused by weak links. The predictive performance and anti-noise performance of the proposed method are verified on the CEPRI-FS-102 bus system, which provides strong technical support for real-time monitoring of system strength.
Funding (whispered speech enhancement study): The National Natural Science Foundation of China (No. 61231002, 61273266, 51075068, 60872073, 60975017, 61003131); the Ph.D. Programs Foundation of the Ministry of Education of China (No. 20110092130004); the Science Foundation for Young Talents in the Educational Committee of Anhui Province (No. 2010SQRL018); the 211 Project of Anhui University (No. 2009QN027B).
Funding (MRPNCC speech intelligibility study): Supported by the National Natural Science Foundation of China (Nos. 61902158, 61673108), the Science and Technology Program of Nantong (JC2018129, MS12018082), and the Top-notch Academic Programs Project of Jiangsu Higher Education Institutions (PPZY2015B135).