1,207 articles found
Real-Time Speech Enhancement Based on Convolutional Recurrent Neural Network
1
Authors: S. Girirajan, A. Pandian. Intelligent Automation & Soft Computing, SCIE, 2023, Issue 2, pp. 1987-2001 (15 pages)
Speech enhancement is the task of taking a noisy speech input and producing an enhanced speech output. In recent years, the need for speech enhancement has increased due to challenges in applications such as hearing aids, Automatic Speech Recognition (ASR), and mobile speech communication systems. Most speech enhancement research has been carried out for English, Chinese, and other European languages. Only a few works address speech enhancement in Indian regional languages. In this paper, we propose a two-fold architecture to perform speech enhancement for Tamil speech signals based on a convolutional recurrent neural network (CRN) that addresses speech enhancement for a real-time single channel, or track of sound, created by the speaker. In the first stage, a mask-based long short-term memory (LSTM) network is used for noise suppression along with a loss function, and in the second stage, a Convolutional Encoder-Decoder (CED) is used for speech restoration. The proposed model is evaluated on various speakers and noisy environments such as babble noise, car noise, and white Gaussian noise. The proposed CRN model improves speech quality by 0.1 points compared with the LSTM base model, and the CRN also requires fewer parameters for training. The performance of the proposed model is outstanding even at low Signal to Noise Ratio (SNR).
Keywords: speech enhancement; convolutional encoder-decoder; long short-term memory; noise suppression; speech restoration
Speech Enhancement via Mask-Mapping Based Residual Dense Network
2
Authors: Lin Zhou, Xijin Chen, Chaoyan Wu, Qiuyue Zhong, Xu Cheng, Yibin Tang. Computers, Materials & Continua, SCIE, EI, 2023, Issue 1, pp. 1259-1277 (19 pages)
Masking-based and spectrum mapping-based methods are the two main families of speech enhancement algorithms built on deep neural networks (DNNs). However, mapping-based methods only utilize the phase of the noisy speech, which limits the upper bound of speech enhancement performance. Masking-based methods need to accurately estimate the mask, which remains the key problem. Combining the advantages of these two types of methods, this paper proposes the speech enhancement algorithm MM-RDN (masking-mapping residual dense network), based on masking-mapping (MM) and a residual dense network (RDN). Using the logarithmic power spectrogram (LPS) of consecutive frames, MM estimates the ideal ratio masking (IRM) matrix of consecutive frames. RDN can make full use of the feature maps of all layers. Meanwhile, using global residual learning to combine shallow and deep features, RDN obtains global dense features from the LPS, thereby improving the estimation accuracy of the IRM matrix. Simulations show that the proposed method achieves attractive speech enhancement performance in various acoustic environments. Specifically, in the untrained acoustic test with limited priors, e.g., unmatched signal-to-noise ratio (SNR) and unmatched noise category, MM-RDN still outperforms the existing convolutional recurrent network (CRN) method in perceptual evaluation of speech quality (PESQ) and other evaluation indexes. This indicates that the proposed algorithm generalizes better to untrained conditions.
Keywords: mask-mapping-based method; residual dense block; speech enhancement
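Note: the abstract above relies on the ideal ratio mask (IRM) as the training target. The sketch below shows one common textbook definition, the square root of speech power over speech-plus-noise power in each frequency bin. It is an assumption that the paper uses this exact form; the function names and the `eps` guard are illustrative only.

```python
import math

def ideal_ratio_mask(speech_power, noise_power, eps=1e-12):
    """IRM for one frame: sqrt(S / (S + N)) in each frequency bin.

    speech_power and noise_power are per-bin power values of the clean
    speech and of the noise; eps only guards against division by zero.
    """
    return [math.sqrt(s / (s + n + eps)) for s, n in zip(speech_power, noise_power)]

def apply_mask(noisy_magnitude, mask):
    """Scale each noisy magnitude bin by its mask value."""
    return [m * y for m, y in zip(mask, noisy_magnitude)]

# Three bins: speech only, equal speech and noise, noise only.
speech = [4.0, 1.0, 0.0]
noise = [0.0, 1.0, 2.0]
mask = ideal_ratio_mask(speech, noise)
enhanced = apply_mask([2.0, 1.5, 1.4], mask)
```

The mask stays close to 1 where speech dominates and falls to 0 where only noise is present, which is why an accurate mask estimate matters so much for masking-based methods.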
Using Hybrid Penalty and Gated Linear Units to Improve Wasserstein Generative Adversarial Networks for Single-Channel Speech Enhancement
3
Authors: Xiaojun Zhu, Heming Huang. Computer Modeling in Engineering & Sciences, SCIE, EI, 2023, Issue 6, pp. 2155-2172 (18 pages)
Recently, speech enhancement methods based on Generative Adversarial Networks have achieved good performance on time-domain noisy signals. However, the training of Generative Adversarial Networks suffers from problems such as convergence difficulty and model collapse. In this work, an end-to-end speech enhancement model based on Wasserstein Generative Adversarial Networks is proposed, with several improvements aimed at faster convergence and better generated speech quality. Specifically, in the encoder part of the generator, each convolution layer adopts different convolution kernel sizes to obtain speech coding information at multiple scales; a gated linear unit is introduced to alleviate the vanishing gradient problem as network depth increases; the gradient penalty of the discriminator is replaced with spectral normalization to accelerate the convergence of the model; and a hybrid penalty term composed of L1 regularization and a scale-invariant signal-to-distortion ratio is introduced into the loss function of the generator to improve the quality of generated speech. Experimental results on both the TIMIT corpus and a Tibetan corpus show that the proposed model significantly improves speech quality and accelerates convergence.
Keywords: speech enhancement; generative adversarial networks; hybrid penalty; gated linear units; multi-scale convolution
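Note: the hybrid penalty in the abstract above combines L1 regularization with a scale-invariant signal-to-distortion ratio (SI-SDR). The sketch below shows the usual SI-SDR definition only: project the estimate onto the target, then compare the energy of the projection with the energy of the residual. How the paper weights this term against L1 is not stated in the abstract and is not reproduced here; the function name and `eps` are illustrative.

```python
import math

def si_sdr(estimate, target, eps=1e-12):
    """Scale-invariant signal-to-distortion ratio in dB."""
    dot = sum(e * t for e, t in zip(estimate, target))
    target_energy = sum(t * t for t in target) + eps
    scale = dot / target_energy                      # optimal scaling of the target
    projection = [scale * t for t in target]         # part of the estimate explained by the target
    residual = [e - p for e, p in zip(estimate, projection)]
    signal_energy = sum(p * p for p in projection)
    noise_energy = sum(r * r for r in residual) + eps
    return 10.0 * math.log10(signal_energy / noise_energy + eps)

target = [1.0, 2.0, 3.0]
estimate = [1.0, 2.0, 3.5]
score = si_sdr(estimate, target)
score_rescaled = si_sdr([3.0 * e for e in estimate], target)
```

Rescaling the estimate leaves the score essentially unchanged, which is the property that makes the measure robust to gain differences between the generated and the clean speech.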
Adversarial Examples Protect Your Privacy on Speech Enhancement System
4
Authors: Mingyu Dong, Diqun Yan, Rangding Wang. Computer Systems Science & Engineering, SCIE, EI, 2023, Issue 7, pp. 1-12 (12 pages)
Speech is easily leaked imperceptibly. When people use their phones, the personal voice assistant is constantly listening and waiting to be activated. Private content in speech may be maliciously extracted through automatic speech recognition (ASR) technology by some applications on phone devices. To guarantee that the recognized speech content is accurate, speech enhancement technology is used to denoise the input speech. Speech enhancement technology has developed rapidly along with deep neural networks (DNNs), but adversarial examples can cause DNNs to fail. This vulnerability of DNNs can be used to protect privacy in speech. In this work, we propose an adversarial method to degrade speech enhancement systems, which can prevent the malicious extraction of private information in speech. Experimental results show that speech enhancement removes most of the speech content from the generated adversarial examples or replaces it with the target speech content. The word error rate (WER) between the recognition results of the enhanced original example and the enhanced adversarial example can reach 89.0%. The WER of the targeted attack between the enhanced adversarial example and the target example is as low as 33.75%. The adversarial perturbation causes a change much larger than the perturbation itself: the ratio of the difference between the two enhanced examples to the adversarial perturbation can exceed 1.4430. Meanwhile, the transferability between different speech enhancement models is also investigated. The low transferability of the method can be used to ensure that the content of the adversarial example is not damaged, so that useful information can still be extracted by a friendly ASR. This work can prevent the malicious extraction of speech.
Keywords: adversarial example; speech enhancement; privacy protection; deep neural network
Speech enhancement based on leakage constraints DF-GSC (Cited by: 1)
5
Authors: 邹采荣, 陈国明, 赵力. Journal of Southeast University (English Edition), EI, CAS, 2007, Issue 4, pp. 507-511 (5 pages)
In order to improve the performance of generalized sidelobe canceller (GSC) based speech enhancement, a leakage constraints decision feedback generalized sidelobe canceller (LCDF-GSC) algorithm is proposed. The method adopts DF-GSC against signal mismatch, and introduces a leakage factor in the cost function to deal with the speech leakage problem, which is caused by the portion of the speech signal present in the noise reference signal. Simulation results show that although the signal-to-noise ratio (SNR) of the speech signal through LCDF-GSC is slightly lower than that of DF-GSC, the IS measurements show that the distortion of the former is less than that of the latter. MOS (mean opinion score) results also indicate that the LCDF-GSC algorithm is better than DF-GSC and the Wiener filter algorithm.
Keywords: speech enhancement; general sidelobe canceller (GSC); speech leakage
A speech enhancement algorithm to reduce noise and compensate for partial masking effect (Cited by: 4)
6
Authors: JEON Yu-yong, LEE Sang-min. Journal of Central South University, SCIE, EI, CAS, 2011, Issue 4, pp. 1121-1127 (7 pages)
To enhance speech quality degraded by environmental noise, an algorithm was proposed to reduce the noise and reinforce the speech. The minima controlled recursive averaging (MCRA) algorithm was used to estimate the noise spectrum, and the partial masking effect, one of the psychoacoustic properties, was introduced to reinforce speech. Performance was evaluated by comparing the PESQ (perceptual evaluation of speech quality) and segSNR (segmental signal to noise ratio) of the proposed algorithm with those of the conventional algorithm. As a result, the average PESQ of the proposed algorithm was higher than that of the conventional noise reduction algorithm, and segSNR was higher by as much as 3.2 dB on average than that of the noise reduction algorithm.
Keywords: speech enhancement; noise reduction; psychoacoustic property; human hearing property
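Note: segSNR, one of the two measures reported in the abstract above, averages a per-frame SNR over the signal. The sketch below follows the common convention of clamping each frame to a fixed dB range before averaging. The frame length and the clamp limits of -10 and 35 dB are assumptions chosen for illustration, not values taken from the paper.

```python
import math

def segmental_snr(clean, enhanced, frame_len=4, lo=-10.0, hi=35.0, eps=1e-12):
    """Mean of per-frame SNR values in dB, each clamped to [lo, hi]."""
    values = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        c = clean[start:start + frame_len]
        e = enhanced[start:start + frame_len]
        signal = sum(x * x for x in c)
        error = sum((x - y) ** 2 for x, y in zip(c, e))
        snr_db = 10.0 * math.log10((signal + eps) / (error + eps))
        values.append(min(max(snr_db, lo), hi))
    if not values:
        raise ValueError("signal is shorter than one frame")
    return sum(values) / len(values)

# First frame is reproduced perfectly, second frame is lost entirely.
clean = [1.0] * 8
enhanced = [1.0] * 4 + [0.0] * 4
value = segmental_snr(clean, enhanced)
```

Clamping keeps silent or perfectly reconstructed frames from dominating the average, which is why segSNR is often preferred over a single global SNR for speech.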
SPEECH ENHANCEMENT USING AN MMSE SHORT TIME DCT COEFFICIENTS ESTIMATOR WITH SUPERGAUSSIAN SPEECH MODELING (Cited by: 4)
7
Authors: Zou Xia, Zhang Xiongwei. Journal of Electronics (China), 2007, Issue 3, pp. 332-337 (6 pages)
In this paper, two speech enhancement systems with supergaussian speech modeling are presented. The clean speech components are estimated by a Minimum-Mean-Square-Error (MMSE) estimator under the assumption that the DCT coefficients of clean speech are modeled by a Laplacian or a Gamma distribution and the DCT coefficients of the noise are Gaussian distributed. Then, MMSE estimators under speech presence uncertainty are derived. Furthermore, proper estimators of the speech statistical parameters are proposed. The speech Laplacian factor is estimated by a new decision-directed method. The simulation results show that the proposed algorithm yields less residual noise and better speech quality than the Gaussian based speech enhancement algorithms proposed in recent years.
Keywords: speech enhancement; speech model; Minimum-Mean-Square-Error (MMSE); Super Gaussian
A continuous differentiable wavelet threshold function for speech enhancement (Cited by: 3)
8
Authors: 贾海蓉, 张雪英, 白静. Journal of Central South University, SCIE, EI, CAS, 2013, Issue 8, pp. 2219-2225 (7 pages)
Speech enhanced with the traditional wavelet threshold function suffered from auditory oscillation distortion and a low signal-to-noise ratio (SNR). To solve these problems, a new continuous differentiable threshold function for speech enhancement was presented. Firstly, the function adopted narrow threshold areas, preserved the smaller speech signals, and improved the speech quality; secondly, based on the properties of continuous differentiability and non-fixed deviation, each area function was derived step by step mathematically, which ensured that the enhanced speech was continuous and smooth and removed the auditory oscillation distortion; finally, combined with the Bark wavelet packets, it further improved human auditory perception. Experimental results show that the segmental SNR and PESQ (perceptual evaluation of speech quality) of the enhanced speech using this method increase effectively, compared with the existing speech enhancement algorithms based on wavelet threshold.
Keywords: continuous differentiable wavelet threshold function; speech enhancement; Bark wavelet packet; non-fixed deviation; noise
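Note: for context on the abstract above, the sketch below shows the two traditional wavelet threshold functions, not the new function the paper proposes, whose form is not given in the abstract. Hard thresholding is discontinuous at the threshold, and soft thresholding shrinks every surviving coefficient by the same fixed amount, which appear to be the two shortcomings the continuous, non-fixed-deviation design targets. Function and variable names are illustrative.

```python
import math

def hard_threshold(w, t):
    """Keep the coefficient unchanged if it exceeds t in magnitude, else zero it."""
    return w if abs(w) > t else 0.0

def soft_threshold(w, t):
    """Zero small coefficients and shrink the rest toward zero by exactly t."""
    if abs(w) <= t:
        return 0.0
    return math.copysign(abs(w) - t, w)

coefficients = [-1.5, -0.5, 0.5, 1.5]
hard = [hard_threshold(w, 1.0) for w in coefficients]
soft = [soft_threshold(w, 1.0) for w in coefficients]
```

In this example the hard rule jumps from 0 straight to the full coefficient value, while the soft rule is continuous but leaves a constant offset of `t` on every retained coefficient.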
Speech enhancement through voice activity detection using speech absence probability based on Teager energy (Cited by: 2)
9
Authors: PARK Yun-sik, LEE Sang-min. Journal of Central South University, SCIE, EI, CAS, 2013, Issue 2, pp. 424-432 (9 pages)
In this work, a novel voice activity detection (VAD) algorithm that uses speech absence probability (SAP) based on Teager energy (TE) was proposed for speech enhancement. The proposed method employs local SAP (LSAP) based on the TE of noisy speech as a feature parameter for VAD in each frequency subband, rather than conventional LSAP. Results show that the TE operator can enhance the ability to discriminate speech and noise and further suppress noise components. Therefore, TE-based LSAP provides a better representation of LSAP, resulting in improved VAD for estimating noise power in a speech enhancement algorithm. In addition, the presented method utilizes TE-based global SAP (GSAP) derived in each frame as the weighting parameter for modifying the adopted TE operator and improving its performance. The proposed algorithm was evaluated by objective and subjective quality tests under various environments, and was shown to produce better results than the conventional method.
Keywords: speech enhancement; Teager energy; speech absence probability; voice activity detection
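Note: the Teager energy (TE) operator named in the abstract above has a standard discrete-time form, psi[x(n)] = x(n)^2 - x(n-1) * x(n+1). The sketch below applies it to a plain time-domain sequence. How the paper applies the operator per frequency subband and feeds it into the speech absence probability is not described in the abstract and is not reproduced here.

```python
import math

def teager_energy(x):
    """Discrete Teager energy: x[n]^2 - x[n-1] * x[n+1] for interior samples."""
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

# For a pure tone A*cos(omega*n) the operator returns the constant
# A^2 * sin(omega)^2, so it tracks both amplitude and frequency.
amplitude, omega = 2.0, 0.3
tone = [amplitude * math.cos(omega * n) for n in range(50)]
energy = teager_energy(tone)
expected = (amplitude * math.sin(omega)) ** 2
```

Because the output grows with both amplitude and frequency, the operator responds differently to structured speech components than to broadband noise, which is consistent with the discrimination benefit the abstract reports.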
Mobile Communication Voice Enhancement Under Convolutional Neural Networks and the Internet of Things
10
Authors: Jiajia Yu. Intelligent Automation & Soft Computing, SCIE, 2023, Issue 7, pp. 777-797 (21 pages)
This study aims to reduce the interference of ambient noise in mobile communication, improve the accuracy and authenticity of information transmitted by sound, and guarantee the accuracy of voice information delivered by mobile communication. First, the principles and techniques of speech enhancement are analyzed, and a fast lateral recursive least square (FLRLS) method is adopted to process sound data. Then, a convolutional neural network (CNN)-based noise recognition CNN (NR-CNN) algorithm and a speech enhancement model are proposed. Finally, experiments are designed to verify the performance of the proposed algorithm and model. The experimental results show that the noise classification accuracy of the NR-CNN noise recognition algorithm is higher than 99.82%, and the recall rate and F1 value are also higher than 99.92. The proposed sound enhancement model can effectively enhance the original sound in the case of noise interference. After the CNN is incorporated, the average value of all noisy sound perception quality evaluation system values is improved by over 21% compared with that of the traditional noise reduction method. The proposed algorithm can adapt to a variety of voice environments and can simultaneously enhance and denoise a variety of different types of voice signals, and the processing effect is better than that of traditional sound enhancement models. In addition, the sound distortion index of the proposed speech enhancement model is lower than that of the control group, indicating that adding the CNN is less likely to cause sound signal distortion in various sound environments and shows superior robustness. In summary, the proposed CNN-based speech enhancement model shows significant sound enhancement effects, stable performance, and strong adaptability. This study provides a reference and basis for research applying neural networks in speech enhancement.
Keywords: convolutional neural networks; speech enhancement; noise recognition; deep learning; human-computer interaction; Internet of Things
Enhanced Frequency-Domain Frost Algorithm Using Conjugate Gradient Techniques for Speech Enhancement (Cited by: 1)
11
Authors: Shengkui Zhao, Douglas L. Jones. Journal of Electronic Science and Technology, CAS, 2012, Issue 2, pp. 158-162 (5 pages)
In this paper, the frequency-domain Frost algorithm is enhanced by using conjugate gradient techniques for speech enhancement. Unlike the non-adaptive approach of computing the optimum minimum variance distortionless response (MVDR) solution with the correlation matrix inversion, the Frost algorithm implementing the stochastic constrained least mean square (LMS) algorithm can adaptively converge to the MVDR solution in the mean-square sense, but with a very slow convergence rate. In this paper, we propose a frequency-domain constrained conjugate gradient (FDCCG) algorithm to speed up the convergence. The devised FDCCG algorithm avoids the matrix inversion and exhibits fast convergence. Speech enhancement experiments for a target speech signal corrupted by two and five interfering speech signals are demonstrated using a four-channel acoustic-vector-sensor (AVS) microphone array and show superior performance.
Keywords: adaptive signal processing; convergence; correlation; speech enhancement; microphone arrays
Speech Enhancement Based on Approximate Message Passing (Cited by: 1)
12
Authors: Chao Li, Ting Jiang, Sheng Wu. China Communications, SCIE, CSCD, 2020, Issue 8, pp. 187-198 (12 pages)
To overcome the limitations of conventional speech enhancement methods, such as an inaccurate voice activity detector (VAD) and noise estimation, a novel speech enhancement algorithm based on approximate message passing (AMP) is adopted. AMP exploits the difference between speech and noise sparsity to remove or mute the noise from the corrupted speech. The AMP algorithm is adopted to reconstruct the clean speech efficiently for speech enhancement. More specifically, the prior probability distribution of the speech sparsity coefficient is characterized by a Gaussian model, and the hyper-parameters of the prior model are learned by the expectation maximization (EM) algorithm. We utilize the k-nearest neighbor (k-NN) algorithm to learn the sparsity, exploiting the fact that the speech coefficients of adjacent frames are correlated. In addition, computational simulations are used to validate the proposed algorithm, which achieves better speech enhancement performance than four baseline methods, namely Wiener filtering, subspace pursuit (SP), distributed sparsity adaptive matching pursuit (DSAMP), and expectation-maximization Gaussian-model approximate message passing (EM-GAMP), under different compression ratios and a wide range of signal to noise ratios (SNRs).
Keywords: speech enhancement; approximate message passing; Gaussian model; expectation maximization algorithm
Single-Channel Speech Enhancement Based on Improved Frame-Iterative Spectral Subtraction in the Modulation Domain (Cited by: 1)
13
Authors: Chao Li, Ting Jiang, Sheng Wu. China Communications, SCIE, CSCD, 2021, Issue 9, pp. 100-115 (16 pages)
Aiming at the problem of musical noise introduced by classical spectral subtraction, a short-time modulation domain (STM) spectral subtraction method has been successfully applied for single-channel speech enhancement. However, due to inaccurate voice activity detection (VAD), the residual musical noise and the enhancement performance still need to be improved, especially in low signal to noise ratio (SNR) scenarios. To address this issue, an improved frame iterative spectral subtraction in the STM domain (IMModSSub) is proposed. More specifically, exploiting the inter-frame correlation, noise subtraction is directly applied to the noisy signal for each frame in the STM domain. Then, the noisy signal is classified into speech or silence frames based on a predefined threshold of segmented SNR. With these classification results, a corresponding mask function is developed for noisy speech after noise subtraction. Finally, exploiting the increased sparsity of the speech signal in the modulation domain, the orthogonal matching pursuit (OMP) technique is applied to the speech frames to improve speech quality and intelligibility. The effectiveness of the proposed method is evaluated with three types of noise, including white noise, pink noise, and HF channel noise. The obtained results show that the proposed method outperforms some established baselines at lower SNRs (-5 to +5 dB).
Keywords: short-time modulation domain; single-channel speech enhancement; modulation improved frame iterative spectral subtraction; low SNRs
An Efficient Reference Free Adaptive Learning Process for Speech Enhancement Applications (Cited by: 1)
14
Authors: Girika Jyoshna, Md. Zia Ur Rahman, L. Koteswararao. Computers, Materials & Continua, SCIE, EI, 2022, Issue 2, pp. 3067-3080 (14 pages)
For issues like hearing impairment, speech therapy and hearing aids play a major role in reducing the impairment. Removal of noise signals from speech signals is a key task in hearing aids as well as in speech therapy. During the transmission of speech signals, several noise components contaminate the actual speech components. This paper presents a new adaptive speech enhancement (ASE) method based on a modified version of singular spectrum analysis (MSSA). The MSSA generates a reference signal for ASE, so that the ASE does not need a separately fed reference component. The MSSA adopts three key steps for generating the reference from the contaminated speech only: decomposition, grouping and reconstruction. The generated signal is taken as the reference for variable size adaptive learning algorithms. In this work, two categories of adaptive learning algorithms are used: the step variable adaptive learning (SVAL) algorithm and the time variable step size adaptive learning (TVAL) algorithm. Further, the sign regressor function is applied to the adaptive learning algorithms to reduce their computational complexity. The performance of the proposed schemes is measured in terms of signal to noise ratio improvement (SNRI), excess mean square error (EMSE) and misadjustment (MSD). For cockpit noise, these measures are found to be 29.2850, -27.6060 and 0.0758 dB respectively in the experiments using the SVAL algorithm. Considering the reduced number of multiplications, the sign regressor version of the SVAL based ASE method is found to be better than its counterparts.
Keywords: adaptive algorithm; speech enhancement; singular spectrum analysis; reference free noise canceller; variable step size
Speech Enhancement Using Cross-Correlation Compensated Multi-Band Wiener Filter Combined with Harmonic Regeneration (Cited by: 1)
15
Authors: Venkata Rama Rao, Rama Murthy, K. Srinivasa Rao. Journal of Signal and Information Processing, 2011, Issue 2, pp. 117-124 (8 pages)
The speech signal is in general corrupted by noise, and the noise does not affect the speech signal uniformly over the entire spectrum. An improved Wiener filtering method is proposed in this paper for reducing background noise from speech signals in colored noise environments. In view of the nonlinear variation of human ear sensitivity across the frequency spectrum, a nonlinear multi-band Bark scale frequency spacing approach is used. The cross-correlation between the speech and noise signals is considered in the proposed method to reduce colored noise. To overcome the harmonic distortion introduced in the enhanced speech, the proposed method regenerates the suppressed harmonics. Objective and subjective tests were carried out to demonstrate the improvement in the perceptual quality of speech by the proposed technique.
Keywords: speech enhancement; Wiener filter; critical band; speech harmonics
Speech Enhancement via Residual Dense Generative Adversarial Network (Cited by: 1)
16
Authors: Lin Zhou, Qiuyue Zhong, Tianyi Wang, Siyuan Lu, Hongmei Hu. Computer Systems Science & Engineering, SCIE, EI, 2021, Issue 9, pp. 279-289 (11 pages)
Generative adversarial networks (GANs) have received increasing attention for end-to-end speech enhancement in recent years. Various GAN-based enhancement methods have been presented to improve the quality of reconstructed speech. However, the performance of these GAN-based methods is worse than that of masking-based methods. To tackle this problem, we propose a speech enhancement method with a residual dense generative adversarial network (RDGAN) that maps the log-power spectrum (LPS) of degraded speech to the clean one. In detail, a residual dense block (RDB) architecture is designed to better estimate the LPS of clean speech, which can extract rich local features of the LPS through densely connected convolution layers. Meanwhile, sequential RDB connections are incorporated on various scales of the LPS, which significantly increases the feature learning flexibility and robustness in the time-frequency domain. Simulations show that the proposed method achieves attractive speech enhancement performance in various acoustic environments. Specifically, in the untrained acoustic test with limited priors, e.g., unmatched signal-to-noise ratio (SNR) and unmatched noise category, RDGAN can still outperform the existing GAN-based methods and masking-based method in PESQ and other evaluation indexes. This indicates that our method generalizes better to untrained conditions.
Keywords: generative adversarial networks; neural networks; residual dense block; speech enhancement
Single-Channel Speech Enhancement Using Critical-Band Rate Scale Based Improved Multi-Band Spectral Subtraction (Cited by: 1)
17
Authors: Navneet Upadhyay, Abhijit Karmakar. Journal of Signal and Information Processing, 2013, Issue 3, pp. 314-326 (13 pages)
This paper addresses the problem of single-channel speech enhancement in adverse environments. An improved multi-band spectral subtraction based on the critical-band rate scale is investigated for enhancement of single-channel speech. In this work, the whole speech spectrum is divided into different non-uniformly spaced frequency bands in accordance with the critical-band rate scale of the psycho-acoustic model, and spectral over-subtraction is carried out separately in each band. In addition, the noise in each band is estimated with an adaptive noise estimation approach that does not require explicit speech silence detection. The noise is estimated and updated by adaptively smoothing the noisy signal power in each band. The smoothing parameter is controlled by the a-posteriori signal-to-noise ratio (SNR). For the performance analysis of the proposed algorithm, objective measures such as SNR, segmental SNR, and perceptual evaluation of speech quality are computed for a variety of noises at different SNR levels. The speech spectrograms and objective evaluations of the proposed algorithm are compared with those of other standard speech enhancement algorithms, and show that the musical structure of the remnant noise and the background noise are better suppressed by the proposed algorithm.
Keywords: single-channel speech enhancement; critical-band rate scale; spectral over-subtraction; adaptive noise estimation; objective measure; speech spectrograms
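Note: spectral over-subtraction, the core step in the abstract above, subtracts a scaled noise estimate from the noisy power in each band and applies a spectral floor so the result never goes negative. The sketch below shows the classic form of that rule. The over-subtraction factor `alpha`, the floor `beta`, and the idea of passing one power value per band are illustrative assumptions; the paper's band layout and its adaptive noise estimator are not reproduced.

```python
def over_subtract(noisy_power, noise_power, alpha=4.0, beta=0.01):
    """Per-band spectral over-subtraction with a spectral floor.

    Each output is noisy - alpha * noise, unless that falls to or below
    beta * noise, in which case the floor beta * noise is used instead.
    """
    cleaned = []
    for y, d in zip(noisy_power, noise_power):
        candidate = y - alpha * d
        floor = beta * d
        cleaned.append(candidate if candidate > floor else floor)
    return cleaned

# Band 0 is dominated by speech, band 1 contains only noise.
result = over_subtract([10.0, 1.0], [1.0, 1.0])
```

A larger `alpha` removes more residual noise at the cost of speech distortion, while the floor masks the isolated spectral peaks that are heard as musical noise. Choosing these values per band is what a multi-band scheme allows.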
DNN-Based Speech Enhancement Using Soft Audible Noise Masking for Wind Noise Reduction (Cited by: 1)
18
Authors: Haichuan Bai, Fengpei Ge, Yonghong Yan. China Communications, SCIE, CSCD, 2018, Issue 9, pp. 235-243 (9 pages)
This paper presents a deep neural network (DNN)-based speech enhancement algorithm using soft audible noise masking for single-channel wind noise reduction. To reduce the low-frequency residual noise, a psychoacoustic model is adopted to calculate the masking threshold from the estimated clean speech spectrum. The gain for noise suppression is obtained based on soft audible noise masking by comparing the estimated wind noise spectrum with the masking threshold. To deal with abruptly time-varying noisy signals, two separate DNN models are utilized to estimate the spectra of the clean speech and wind noise components. Experimental results on subjective and objective quality tests show that the proposed algorithm achieves better performance than the conventional DNN-based wind noise reduction method.
Keywords: wind noise reduction; speech enhancement; soft audible noise masking; psychoacoustic model; deep neural network
Single Channel Speech Enhancement by De-noising Using Stationary Wavelet Transform (Cited by: 2)
19
Authors: 张德祥, 高清维, 陈军宁. Journal of Electronic Science and Technology of China, 2006, Issue 1, pp. 39-42 (4 pages)
A method of single channel speech enhancement by de-noising using the stationary wavelet transform is proposed. The approach processes the multi-resolution wavelet coefficients individually and then reconstructs the recovered signal. The time-invariant characteristic of the stationary wavelet transform is particularly useful in speech de-noising. Experimental results show that the proposed de-noising algorithm can achieve an excellent balance between suppressing noise effectively and preserving as many target characteristics of the original signal as possible. This de-noising algorithm offers superior performance in speech signal noise suppression.
Keywords: stationary wavelet transform; speech enhancement; de-noising; SNR
A Model-Based Soft Decision Approach for Speech Enhancement
20
Authors: Xianyun Wang, Changchun Bao, Feng Bao. China Communications, SCIE, CSCD, 2017, Issue 9, pp. 11-22 (12 pages)
Many speech enhancement algorithms that deal with noise reduction are based on a binary masking decision (termed the hard decision), which may cause some regions of the synthesized speech to be discarded. In view of this problem, a soft decision is often used as an optimal technique for speech restoration. In this paper, considering a new form of speech and noise models, we present two model-based soft decision techniques. The first technique estimates a ratio mask generated by the exact Bayesian estimators of speech and noise. For the second technique, we consider the issue that an optimum local criterion (LC) for a certain SNR may not be appropriate for other SNRs, so we estimate a probabilistic mask with a variable LC. Experimental results show that the proposed method achieves better speech quality than the reference methods.
Keywords: speech enhancement; soft masks; CASA; threshold