Funding: Supported by the 973 Project of China (No. 2002CB312102) and the National Natural Science Foundation of China (No. 60272044).
Abstract: Two gain forms of spectral amplitude subtraction are derived theoretically without neglecting the correlation between the speech and noise spectra within a frame. In the implementation, the constrained gain is expressed as a function of the noncausal a priori SNR (signal-to-noise ratio). The noise and the noncausal a priori SNR are estimated from the multitaper spectrum of the noisy signal, using algorithms modified to suit the multitaper spectrum. Objective evaluations show that, for white Gaussian noise, the proposed method outperforms some methods based on LSA (log spectral amplitude) in terms of MBSD (modified Bark spectral distortion), segmental SNR, and overall SNR. Informal listening tests show that speech reconstructed in this way has little speech distortion and that musical noise is nearly inaudible even at low SNR.
Abstract: Spectral subtraction is used in this research as a method to remove noise from noisy speech signals in the frequency domain. The method consists of computing the spectrum of the noisy speech using the Fast Fourier Transform (FFT) and subtracting the average magnitude of the noise spectrum from the noisy speech spectrum. We applied spectral subtraction to the speech signal “Real graph”. A digital audio recorder system embedded in a personal computer was used to sample the speech signal “Real graph”, to which we digitally added vacuum cleaner noise. The noise removal algorithm was implemented in Matlab by storing the noisy speech data in Hanning time-windowed, half-overlapped data buffers, computing the corresponding spectra using the FFT, removing the noise from the noisy speech, and reconstructing the speech in the time domain using the inverse Fast Fourier Transform (IFFT). The performance of the algorithm was evaluated by calculating the speech-to-noise ratio (SNR). Frame averaging was introduced as an optional technique that could improve the SNR. Seventeen configurations, with various Hanning window lengths, various degrees of data buffer overlap, and various numbers of frames to be averaged, were investigated with a view to improving the SNR. Results showed that one-fourth overlapped data buffers with 128-point Hanning windows and no frame averaging gave the best performance in removing noise from the noisy speech.
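The processing chain described in this abstract (windowed overlapping buffers, FFT, subtraction of the average noise magnitude, IFFT) can be sketched roughly as follows. This is a minimal illustration in Python with NumPy, not the authors' Matlab code. The function names, the separate `noise_only` recording used for the noise estimate, the flooring of negative magnitudes at zero, and the reuse of the noisy phase are all assumptions made for the sketch. The abstract specifies none of them.

```python
import numpy as np


def frame_signal(x, frame_len, hop):
    """Slice x into overlapping frames, one per row (hypothetical helper).

    Trailing samples that do not fill a whole frame are dropped.
    """
    n_frames = 1 + (len(x) - frame_len) // hop
    return np.stack([x[i * hop:i * hop + frame_len] for i in range(n_frames)])


def spectral_subtraction(noisy, noise_only, frame_len=128, hop=64):
    """Magnitude spectral subtraction with half-overlapped Hann frames."""
    # Periodic Hann window: with hop = frame_len // 2 the shifted windows
    # sum to 1, so plain overlap-add rebuilds the interior of the signal.
    window = np.hanning(frame_len + 1)[:-1]

    # Average noise magnitude spectrum, taken from a noise-only recording
    # (assumption: such a recording is available).
    noise_frames = frame_signal(noise_only, frame_len, hop) * window
    noise_mag = np.abs(np.fft.rfft(noise_frames, axis=1)).mean(axis=0)

    enhanced = np.zeros(len(noisy))
    for i, frame in enumerate(frame_signal(noisy, frame_len, hop)):
        spectrum = np.fft.rfft(frame * window)
        # Subtract the noise magnitude; flooring at zero is an assumption.
        magnitude = np.maximum(np.abs(spectrum) - noise_mag, 0.0)
        # Keep the noisy phase (assumption) and return to the time domain.
        cleaned = np.fft.irfft(magnitude * np.exp(1j * np.angle(spectrum)),
                               n=frame_len)
        enhanced[i * hop:i * hop + frame_len] += cleaned
    return enhanced
```

Calling `spectral_subtraction(noisy, noise_only)` returns an array of the same length as `noisy`. The first and last half-frames are covered by a single window only, so they come back attenuated. Other overlap settings, such as the one-fourth overlap the abstract reports as best, would need a window or normalization chosen to match.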
Funding: National Natural Science Foundation of China (NSFC) (No. 61671075); Major Program of the National Natural Science Foundation of China (No. 61631003).
Abstract: To address the musical noise introduced by classical spectral subtraction, a short-time modulation (STM) domain spectral subtraction method has been successfully applied to single-channel speech enhancement. However, because voice activity detection (VAD) is inaccurate, the residual musical noise and the enhancement performance still need to be improved, especially in low signal-to-noise ratio (SNR) scenarios. To address this issue, an improved frame-iterative spectral subtraction in the STM domain (IMModSSub) is proposed. More specifically, using the inter-frame correlation, noise subtraction is applied directly to the noisy signal for each frame in the STM domain. Then, the noisy signal is classified into speech or silence frames based on a predefined threshold on the segmental SNR. With these classification results, a corresponding mask function is developed for the noisy speech after noise subtraction. Finally, exploiting the increased sparsity of the speech signal in the modulation domain, the orthogonal matching pursuit (OMP) technique is applied to the speech frames to improve speech quality and intelligibility. The effectiveness of the proposed method is evaluated with three types of noise: white noise, pink noise, and HF channel noise. The results show that the proposed method outperforms some established baselines at lower SNRs (-5 to +5 dB).
Abstract: This paper addresses the problem of single-channel speech enhancement in adverse environments. An improved multi-band spectral subtraction based on the critical-band rate scale is investigated for the enhancement of single-channel speech. The whole speech spectrum is divided into non-uniformly spaced frequency bands in accordance with the critical-band rate scale of the psycho-acoustic model, and spectral over-subtraction is carried out separately in each band. In addition, the noise in each band is estimated with an adaptive noise estimation approach that does not require explicit speech silence detection. The noise is estimated and updated by adaptively smoothing the noisy signal power in each band. The smoothing parameter is controlled by the a posteriori signal-to-noise ratio (SNR). For the performance analysis of the proposed algorithm, objective measures such as SNR, segmental SNR, and perceptual evaluation of speech quality are computed for a variety of noises at different SNR levels. The speech spectrograms and objective evaluations of the proposed algorithm are compared with those of other standard speech enhancement algorithms and show that the proposed algorithm better suppresses the musical structure of the remnant noise and the background noise.
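The adaptive noise update mentioned in this abstract, in which the noisy power in each band is smoothed with a parameter controlled by the a posteriori SNR, could look roughly like the sketch below. The abstract does not give the mapping from SNR to smoothing parameter. The sigmoid used here, its `slope` and `threshold` values, and the function name are assumptions chosen only to illustrate the idea.

```python
import math


def update_noise_estimate(noise_prev, noisy_power, slope=1.0, threshold=2.0):
    """One recursive update of the per-band noise power estimate.

    noise_prev  : previous noise power estimate, one value per band
    noisy_power : noisy signal power in the current frame, one value per band
    slope, threshold : hypothetical sigmoid parameters (not from the abstract)
    """
    updated = []
    for n_prev, y_pow in zip(noise_prev, noisy_power):
        # A posteriori SNR: current noisy power over the previous noise estimate.
        post_snr = y_pow / max(n_prev, 1e-12)
        # High SNR (speech likely present) pushes alpha towards 1, which
        # freezes the estimate; low SNR lets it track the noisy power.
        alpha = 1.0 / (1.0 + math.exp(-slope * (post_snr - threshold)))
        updated.append(alpha * n_prev + (1.0 - alpha) * y_pow)
    return updated
```

Because the update runs on every frame and the smoothing parameter itself decides how much to trust the new frame, no separate speech/silence decision is needed, which matches the property the abstract emphasizes.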
Abstract: AIM: To assess the value of gemstone spectral imaging (GSI) in efficacy evaluation of hepatocellular cancer (HCC) after transcatheter arterial chemoembolization (TACE) treatment. METHODS: Thirty patients with HCC underwent GSI, including nonenhanced, arterial, portal venous, and delayed phase scans, after TACE treatment. Arterial phase images were acquired with GSI for reconstruction of virtual nonenhanced images and color overlay images. Digital subtraction angiography (DSA) was performed in all these patients. Two blinded and independent readers evaluated the data in two reading sessions: standard nonenhanced, arterial, portal venous, and delayed phase images were read in session A, and the optimal monochromatic images, iodine/water-based images, and spectrum features were read in session B. Sensitivity and specificity were calculated with the DSA data as the reference standard and were compared using the χ² test. RESULTS: DSA revealed 154 lesions in 30 patients, and 100 of them had blood supply. Overall sensitivity and specificity were 72% (72/100) and 77.8% (42/54) for session A, and 97% (97/100) and 94.4% (51/54) for session B, respectively. The sensitivity and specificity of the two reading sessions were significantly different (χ² = 23.04, χ² = 7.11, P < 0.05). CONCLUSION: Compared with conventional CT, GSI could significantly improve the detection of small and multiple lesions without increasing the radiation dose. Based on spectrum features, GSI could assess tumor homogeneity and more accurately identify residual tumors and recurrent or metastatic lesions during efficacy evaluation and follow-up in HCC after TACE treatment.
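The reported sensitivities and specificities follow directly from the lesion counts in the RESULTS. The short check below reproduces them. It assumes that the 100 lesions with blood supply are the DSA-positive cases and the remaining 54 are the DSA-negative cases, which is how the denominators in the abstract read. The names are illustrative only.

```python
def percentage(correct, total):
    """Share of correctly classified lesions, in percent."""
    return 100.0 * correct / total


TOTAL_LESIONS = 154
POSITIVE = 100                      # lesions with blood supply on DSA
NEGATIVE = TOTAL_LESIONS - POSITIVE  # 54 lesions without blood supply

# Correctly identified positive / negative lesions per reading session,
# taken from the fractions quoted in the abstract.
sessions = {
    "A": {"true_pos": 72, "true_neg": 42},
    "B": {"true_pos": 97, "true_neg": 51},
}

for name, counts in sessions.items():
    sensitivity = percentage(counts["true_pos"], POSITIVE)
    specificity = percentage(counts["true_neg"], NEGATIVE)
    print(f"session {name}: sensitivity {sensitivity:.1f}%, "
          f"specificity {specificity:.1f}%")
```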
Funding: The National Natural Science Foundation of China (No. 12174053, 91938203, 11674057, 11874109); the Fundamental Research Funds for the Central Universities (No. 2242021k30019).
Abstract: To address the poor performance of speech signal detection at low signal-to-noise ratio (SNR), a method is proposed to detect active speech frames based on multi-window time-frequency (T-F) diagrams. First, the T-F diagram of the signal is calculated using a multi-window T-F analysis, and a speech test statistic is constructed from the characteristic difference between the signal and the background noise. Second, dynamic double-threshold processing is used for preliminary detection, and the global double-threshold value is then obtained using K-means clustering. Finally, the detection results are obtained by sequential decision. The experimental results show that the overall performance of the method is better than that of traditional methods under various SNR conditions and background noises. The method also has the advantages of low complexity, strong robustness, and adaptability to multiple languages.