42 articles found
Using Hybrid Penalty and Gated Linear Units to Improve Wasserstein Generative Adversarial Networks for Single-Channel Speech Enhancement
1
Authors: Xiaojun Zhu, Heming Huang. Computer Modeling in Engineering & Sciences (SCIE, EI), 2023, No. 6, pp. 2155-2172 (18 pages)
Recently, speech enhancement methods based on Generative Adversarial Networks have achieved good performance on time-domain noisy signals. However, the training of Generative Adversarial Networks suffers from problems such as convergence difficulty and model collapse. In this work, an end-to-end speech enhancement model based on Wasserstein Generative Adversarial Networks is proposed, with several improvements aimed at faster convergence and better generated speech quality. Specifically, in the encoding part of the generator, each convolution layer uses different kernel sizes to obtain speech coding information at multiple scales; a gated linear unit is introduced to alleviate the vanishing gradient problem as network depth increases; the gradient penalty of the discriminator is replaced with spectral normalization to accelerate the convergence of the model; and a hybrid penalty term composed of L1 regularization and a scale-invariant signal-to-distortion ratio is introduced into the loss function of the generator to improve the quality of generated speech. Experimental results on both the TIMIT corpus and a Tibetan corpus show that the proposed model improves speech quality significantly and accelerates convergence.
Keywords: speech enhancement; generative adversarial networks; hybrid penalty; gated linear units; multi-scale convolution
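The hybrid penalty mentioned in the abstract above combines an L1 term with a scale-invariant signal-to-distortion ratio (SI-SDR). The following is only a minimal sketch of that idea, not the authors' implementation: the function names, the weights `l1_weight` and `sdr_weight`, and the way the two terms are summed are assumptions made for illustration.

```python
import math

def si_sdr(estimate, target, eps=1e-8):
    """Scale-invariant SDR in dB between two equal-length sample sequences."""
    dot = sum(e * t for e, t in zip(estimate, target))
    target_energy = sum(t * t for t in target)
    # Project the estimate onto the target so that rescaling the estimate
    # does not change the score.
    scale = dot / (target_energy + eps)
    projection = [scale * t for t in target]
    residual = [e - p for e, p in zip(estimate, projection)]
    num = sum(p * p for p in projection)
    den = sum(r * r for r in residual)
    return 10.0 * math.log10((num + eps) / (den + eps))

def hybrid_penalty(estimate, target, l1_weight=1.0, sdr_weight=1.0):
    """Hypothetical hybrid penalty: mean absolute error minus weighted SI-SDR."""
    l1 = sum(abs(e - t) for e, t in zip(estimate, target)) / len(target)
    # SI-SDR is a quality score (higher is better), so it enters with a minus sign.
    return l1_weight * l1 - sdr_weight * si_sdr(estimate, target)
```

In this sketch a lower penalty means a better estimate; how the two terms are actually weighted in the paper is not stated in the abstract.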
Real-Time Speech Enhancement Based on Convolutional Recurrent Neural Network
2
Authors: S. Girirajan, A. Pandian. Intelligent Automation & Soft Computing (SCIE), 2023, No. 2, pp. 1987-2001 (15 pages)
Speech enhancement is the task of taking a noisy speech input and producing an enhanced speech output. In recent years, the need for speech enhancement has increased due to challenges arising in applications such as hearing aids, Automatic Speech Recognition (ASR), and mobile speech communication systems. Most speech enhancement research has been carried out for English, Chinese, and other European languages; only a few works address speech enhancement in Indian regional languages. In this paper, we propose a two-fold architecture to perform speech enhancement for Tamil speech signals based on a convolutional recurrent neural network (CRN) that addresses speech enhancement for a real-time single channel, or track of sound, created by the speaker. In the first stage, a mask-based long short-term memory (LSTM) network is used for noise suppression along with a loss function, and in the second stage, a Convolutional Encoder-Decoder (CED) is used for speech restoration. The proposed model is evaluated on various speakers and noisy environments such as babble noise, car noise, and white Gaussian noise. The proposed CRN model improves speech quality by 0.1 points compared with the LSTM base model and requires fewer parameters for training. The proposed model performs well even at low Signal to Noise Ratio (SNR).
Keywords: speech enhancement; convolutional encoder-decoder; long short-term memory; noise suppression; speech restoration
Speech Enhancement via Mask-Mapping Based Residual Dense Network
3
Authors: Lin Zhou, Xijin Chen, Chaoyan Wu, Qiuyue Zhong, Xu Cheng, Yibin Tang. Computers, Materials & Continua (SCIE, EI), 2023, No. 1, pp. 1259-1277 (19 pages)
Masking-based and spectrum mapping-based methods are the two main classes of speech enhancement algorithms using deep neural networks (DNNs). However, mapping-based methods only utilize the phase of noisy speech, which limits the upper bound of speech enhancement performance, while masking-based methods need to estimate the mask accurately, which remains the key problem. Combining the advantages of these two types of methods, this paper proposes the speech enhancement algorithm MM-RDN (masking-mapping residual dense network), based on masking-mapping (MM) and a residual dense network (RDN). Using the logarithmic power spectrogram (LPS) of consecutive frames, MM estimates the ideal ratio masking (IRM) matrix of consecutive frames. RDN can make full use of the feature maps of all layers. Meanwhile, using global residual learning to combine shallow and deep features, RDN obtains global dense features from the LPS, thereby improving the estimation accuracy of the IRM matrix. Simulations show that the proposed method achieves attractive speech enhancement performance in various acoustic environments. Specifically, in untrained acoustic tests with limited priors, e.g., unmatched signal-to-noise ratio (SNR) and unmatched noise category, MM-RDN still outperforms the existing convolutional recurrent network (CRN) method in terms of perceptual evaluation of speech quality (PESQ) and other evaluation indexes. This indicates that the proposed algorithm generalizes better to untrained conditions.
Keywords: mask-mapping-based method; residual dense block; speech enhancement
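The ideal ratio mask (IRM) that the abstract above uses as a training target can be sketched as follows. This is the commonly used textbook form, assuming the clean-speech and noise power spectra are known per time-frequency bin; the function name, the exponent `beta=0.5`, and the nested-list layout are illustrative assumptions, not details taken from the paper.

```python
def ideal_ratio_mask(speech_power, noise_power, beta=0.5, eps=1e-12):
    """Per-bin IRM: (S / (S + N)) ** beta for matching frames x bins matrices.

    speech_power and noise_power are lists of frames, each a list of
    non-negative power values per frequency bin.
    """
    return [
        [(s / (s + n + eps)) ** beta for s, n in zip(s_row, n_row)]
        for s_row, n_row in zip(speech_power, noise_power)
    ]
```

Mask values lie between 0 (noise-dominated bin) and 1 (speech-dominated bin); an enhancement network estimates this mask from noisy features because the clean and noise spectra are unavailable at test time.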
Adversarial Examples Protect Your Privacy on Speech Enhancement System
4
Authors: Mingyu Dong, Diqun Yan, Rangding Wang. Computer Systems Science & Engineering (SCIE, EI), 2023, No. 7, pp. 1-12 (12 pages)
Speech is easily leaked without the speaker noticing. When people use their phones, the personal voice assistant is constantly listening and waiting to be activated. Private content in speech may be maliciously extracted through automatic speech recognition (ASR) technology by some applications on phone devices. To ensure that the recognized speech content is accurate, speech enhancement technology is used to denoise the input speech. Speech enhancement technology has developed rapidly along with deep neural networks (DNNs), but adversarial examples can cause DNNs to fail. Considering that this vulnerability of DNNs can be used to protect privacy in speech, we propose an adversarial method to degrade speech enhancement systems, which can prevent the malicious extraction of private information in speech. Experimental results show that speech enhancement removes most of the speech content from the generated adversarial examples or replaces it with target speech content. The word error rate (WER) between the recognition results of the enhanced original example and the enhanced adversarial example can reach 89.0%. In the targeted attack, the WER between the enhanced adversarial example and the target example is as low as 33.75%. The adversarial perturbation causes a change in the enhanced output that is much larger than the perturbation itself: the ratio of the difference between the two enhanced examples to the adversarial perturbation can exceed 1.4430. The transferability between different speech enhancement models is also investigated. The low transferability of the method can be used to ensure that the content of the adversarial example is not damaged, so that useful information can still be extracted by a friendly ASR system. This work can prevent the malicious extraction of speech.
Keywords: adversarial example; speech enhancement; privacy protection; deep neural network
Speech enhancement based on leakage constraints DF-GSC (Cited by: 1)
5
Authors: 邹采荣, 陈国明, 赵力. Journal of Southeast University (English Edition) (EI, CAS), 2007, No. 4, pp. 507-511 (5 pages)
In order to improve the performance of generalized sidelobe canceller (GSC) based speech enhancement, a leakage constraints decision feedback generalized sidelobe canceller (LCDF-GSC) algorithm is proposed. The method adopts DF-GSC to counter signal mismatch and introduces a leakage factor in the cost function to deal with the speech leakage problem, which is caused by the part of the speech signal present in the noise reference signal. Simulation results show that although the signal-to-noise ratio (SNR) of the speech signal processed by LCDF-GSC is slightly lower than that of DF-GSC, the IS measurements show that the distortion of the former is less than that of the latter. Mean opinion score (MOS) results also indicate that the LCDF-GSC algorithm is better than DF-GSC and the Wiener filter algorithm.
Keywords: speech enhancement; generalized sidelobe canceller (GSC); speech leakage
A speech enhancement algorithm to reduce noise and compensate for partial masking effect (Cited by: 4)
6
Authors: JEON Yu-yong, LEE Sang-min. Journal of Central South University (SCIE, EI, CAS), 2011, No. 4, pp. 1121-1127 (7 pages)
To enhance speech quality degraded by environmental noise, an algorithm was proposed to reduce the noise and reinforce the speech. The minima controlled recursive averaging (MCRA) algorithm was used to estimate the noise spectrum, and the partial masking effect, one of the psychoacoustic properties, was introduced to reinforce the speech. Performance was evaluated by comparing the PESQ (perceptual evaluation of speech quality) and segSNR (segmental signal to noise ratio) of the proposed algorithm with those of the conventional algorithm. As a result, the average PESQ of the proposed algorithm was higher than that of the conventional noise reduction algorithm, and segSNR was higher by as much as 3.2 dB on average.
Keywords: speech enhancement; noise reduction; psychoacoustic property; human hearing property
A continuous differentiable wavelet threshold function for speech enhancement (Cited by: 3)
7
Authors: 贾海蓉, 张雪英, 白静. Journal of Central South University (SCIE, EI, CAS), 2013, No. 8, pp. 2219-2225 (7 pages)
Speech enhanced with the traditional wavelet threshold function suffers from auditory oscillation distortion and a low signal-to-noise ratio (SNR). To solve these problems, a new continuous differentiable threshold function for speech enhancement was presented. Firstly, the function adopted narrow threshold areas, preserved the smaller speech signal components, and improved the speech quality. Secondly, based on the properties of continuous differentiability and non-fixed deviation, the function for each area was derived step by step mathematically, which ensured that the enhanced speech was continuous and smooth and removed the auditory oscillation distortion. Finally, combined with Bark wavelet packets, it further improved human auditory perception. Experimental results show that the segmental SNR and PESQ (perceptual evaluation of speech quality) of the enhanced speech increase effectively with this method, compared with existing speech enhancement algorithms based on wavelet thresholds.
Keywords: continuous differentiable wavelet threshold function; speech enhancement; Bark wavelet packet; non-fixed deviation; noise
SPEECH ENHANCEMENT USING AN MMSE SHORT TIME DCT COEFFICIENTS ESTIMATOR WITH SUPERGAUSSIAN SPEECH MODELING (Cited by: 4)
8
Authors: Zou Xia, Zhang Xiongwei. Journal of Electronics (China), 2007, No. 3, pp. 332-337 (6 pages)
In this paper, two speech enhancement systems with supergaussian speech modeling are presented. The clean speech components are estimated by a Minimum-Mean-Square-Error (MMSE) estimator under the assumption that the DCT coefficients of clean speech follow a Laplacian or a Gamma distribution and the DCT coefficients of the noise are Gaussian distributed. Then, MMSE estimators under speech presence uncertainty are derived. Furthermore, suitable estimators of the speech statistical parameters are proposed. The speech Laplacian factor is estimated by a new decision-directed method. The simulation results show that the proposed algorithm yields less residual noise and better speech quality than the Gaussian-based speech enhancement algorithms proposed in recent years.
Keywords: speech enhancement; speech model; Minimum-Mean-Square-Error (MMSE); Super Gaussian
Speech enhancement through voice activity detection using speech absence probability based on Teager energy (Cited by: 2)
9
Authors: PARK Yun-sik, LEE Sang-min. Journal of Central South University (SCIE, EI, CAS), 2013, No. 2, pp. 424-432 (9 pages)
In this work, a novel voice activity detection (VAD) algorithm that uses speech absence probability (SAP) based on Teager energy (TE) was proposed for speech enhancement. The proposed method employs local SAP (LSAP) based on the TE of noisy speech, rather than conventional LSAP, as a feature parameter for VAD in each frequency subband. Results show that the TE operator can enhance the ability to discriminate speech from noise and further suppress noise components. Therefore, TE-based LSAP provides a better representation of LSAP, resulting in improved VAD for estimating noise power in a speech enhancement algorithm. In addition, the presented method utilizes TE-based global SAP (GSAP), derived in each frame, as the weighting parameter for modifying the adopted TE operator and improving its performance. The proposed algorithm was evaluated by objective and subjective quality tests under various environments and was shown to produce better results than the conventional method.
Keywords: speech enhancement; Teager energy; speech absence probability; voice activity detection
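For readers unfamiliar with the Teager energy operator referred to in the abstract above, its standard discrete form is ψ[x(n)] = x(n)² − x(n−1)·x(n+1). The snippet below is a minimal time-domain sketch of that textbook definition only; the function name is hypothetical, and the paper's subband processing and SAP computation are not reproduced here.

```python
import math

def teager_energy(x):
    """Discrete Teager energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns len(x) - 2 values, since the first and last samples
    have no neighbour on one side.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

# For a pure tone A*cos(w*n) the operator is constant and equals (A*sin(w))^2,
# so it tracks both amplitude and frequency of the signal.
tone = [2.0 * math.cos(0.3 * n) for n in range(8)]
tone_energy = teager_energy(tone)
```

Because the operator responds to both amplitude and frequency, it tends to emphasize structured components such as speech over broadband noise, which is the property the abstract relies on.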
An Efficient Reference Free Adaptive Learning Process for Speech Enhancement Applications (Cited by: 1)
10
Authors: Girika Jyoshna, Md. Zia Ur Rahman, L. Koteswararao. Computers, Materials & Continua (SCIE, EI), 2022, No. 2, pp. 3067-3080 (14 pages)
For conditions such as hearing impairment, speech therapy and hearing aids play a major role in reducing the impairment. Removal of noise from speech signals is a key task in hearing aids as well as in speech therapy. During the transmission of speech signals, several noise components contaminate the actual speech components. This paper presents a new adaptive speech enhancement (ASE) method based on a modified version of singular spectrum analysis (MSSA). The MSSA generates a reference signal for ASE and frees the ASE from needing an externally fed reference component. The MSSA uses three key steps to generate the reference from the contaminated speech alone: decomposition, grouping, and reconstruction. The generated signal is taken as the reference for variable step size adaptive learning algorithms. In this work, two categories of adaptive learning algorithms are used: the step variable adaptive learning (SVAL) algorithm and the time variable step size adaptive learning (TVAL) algorithm. Further, a sign regressor function is applied to the adaptive learning algorithms to reduce their computational complexity. The performance of the proposed schemes is measured in terms of signal to noise ratio improvement (SNRI), excess mean square error (EMSE), and misadjustment (MSD). For cockpit noise, these measures are found to be 29.2850, -27.6060, and 0.0758 dB, respectively, in experiments using the SVAL algorithm. Considering the reduced number of multiplications, the sign regressor version of the SVAL-based ASE method is found to be better than its counterparts.
Keywords: adaptive algorithm; speech enhancement; singular spectrum analysis; reference free noise canceller; variable step size
Single-Channel Speech Enhancement Based on Improved Frame-Iterative Spectral Subtraction in the Modulation Domain (Cited by: 1)
11
Authors: Chao Li, Ting Jiang, Sheng Wu. China Communications (SCIE, CSCD), 2021, No. 9, pp. 100-115 (16 pages)
To address the musical noise introduced by classical spectral subtraction, a short-time modulation (STM) domain spectral subtraction method has been successfully applied to single-channel speech enhancement. However, due to inaccurate voice activity detection (VAD), the residual musical noise and the enhancement performance still need to be improved, especially in low signal to noise ratio (SNR) scenarios. To address this issue, an improved frame-iterative spectral subtraction in the STM domain (IMModSSub) is proposed. More specifically, exploiting the inter-frame correlation, noise subtraction is applied directly to the noisy signal of each frame in the STM domain. Then, the noisy signal is classified into speech or silence frames based on a predefined threshold on the segmental SNR. With these classification results, a corresponding mask function is developed for the noisy speech after noise subtraction. Finally, exploiting the increased sparsity of the speech signal in the modulation domain, the orthogonal matching pursuit (OMP) technique is applied to the speech frames to improve speech quality and intelligibility. The effectiveness of the proposed method is evaluated with three types of noise: white noise, pink noise, and HF channel noise. The results show that the proposed method outperforms some established baselines at lower SNRs (-5 to +5 dB).
Keywords: short-time modulation domain; single-channel; speech enhancement; improved frame iterative spectral subtraction; low SNRs
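As background for the abstract above, classical power spectral subtraction, the baseline whose musical noise the paper aims to reduce, can be sketched per frequency bin as below. This is not the proposed IMModSSub method; the function name, the over-subtraction factor `alpha`, and the spectral floor `floor` are illustrative assumptions.

```python
import math

def spectral_subtract(noisy_mag, noise_mag, alpha=1.0, floor=0.01):
    """Classical power spectral subtraction for one frame.

    noisy_mag: magnitude spectrum of the noisy frame, one value per bin.
    noise_mag: estimated noise magnitude spectrum, same length.
    """
    enhanced = []
    for y, d in zip(noisy_mag, noise_mag):
        power = y * y - alpha * d * d
        # Bins where the subtraction goes negative are clamped to a small
        # fraction of the noise power; isolated clamped and unclamped bins
        # are what listeners perceive as musical noise.
        enhanced.append(math.sqrt(max(power, floor * d * d)))
    return enhanced
```

The enhanced magnitudes are normally recombined with the noisy phase before the inverse transform; that step is omitted here for brevity.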
Speech Enhancement Based on Approximate Message Passing (Cited by: 1)
12
Authors: Chao Li, Ting Jiang, Sheng Wu. China Communications (SCIE, CSCD), 2020, No. 8, pp. 187-198 (12 pages)
To overcome the limitations of conventional speech enhancement methods, such as inaccurate voice activity detection (VAD) and noise estimation, a novel speech enhancement algorithm based on approximate message passing (AMP) is proposed. AMP exploits the difference between speech and noise sparsity to remove or mute the noise in the corrupted speech, and is used to reconstruct the clean speech efficiently. More specifically, the prior probability distribution of the speech sparsity coefficients is characterized by a Gaussian model, and the hyper-parameters of the prior model are learned by the expectation maximization (EM) algorithm. We use the k-nearest neighbor (k-NN) algorithm to learn the sparsity, based on the fact that the speech coefficients of adjacent frames are correlated. In addition, computational simulations are used to validate the proposed algorithm, which achieves better speech enhancement performance than four baseline methods: Wiener filtering, subspace pursuit (SP), distributed sparsity adaptive matching pursuit (DSAMP), and expectation-maximization Gaussian-model approximate message passing (EM-GAMP), under different compression ratios and a wide range of signal to noise ratios (SNRs).
Keywords: speech enhancement; approximate message passing; Gaussian model; expectation maximization algorithm
DNN-Based Speech Enhancement Using Soft Audible Noise Masking for Wind Noise Reduction (Cited by: 1)
13
Authors: Haichuan Bai, Fengpei Ge, Yonghong Yan. China Communications (SCIE, CSCD), 2018, No. 9, pp. 235-243 (9 pages)
This paper presents a deep neural network (DNN)-based speech enhancement algorithm that uses soft audible noise masking for single-channel wind noise reduction. To reduce the low-frequency residual noise, a psychoacoustic model is adopted to calculate the masking threshold from the estimated clean speech spectrum. The gain for noise suppression is obtained from soft audible noise masking by comparing the estimated wind noise spectrum with the masking threshold. To deal with abruptly time-varying noisy signals, two separate DNN models are used to estimate the spectra of the clean speech and wind noise components. Experimental results on subjective and objective quality tests show that the proposed algorithm achieves better performance than the conventional DNN-based wind noise reduction method.
Keywords: wind noise reduction; speech enhancement; soft audible noise masking; psychoacoustic model; deep neural network
Speech Enhancement via Residual Dense Generative Adversarial Network (Cited by: 1)
14
Authors: Lin Zhou, Qiuyue Zhong, Tianyi Wang, Siyuan Lu, Hongmei Hu. Computer Systems Science & Engineering (SCIE, EI), 2021, No. 9, pp. 279-289 (11 pages)
Generative adversarial networks (GANs) have received increasing attention for end-to-end speech enhancement in recent years. Various GAN-based enhancement methods have been presented to improve the quality of reconstructed speech. However, the performance of these GAN-based methods is worse than that of masking-based methods. To tackle this problem, we propose a speech enhancement method with a residual dense generative adversarial network (RDGAN) that maps the log-power spectrum (LPS) of degraded speech to that of clean speech. In detail, a residual dense block (RDB) architecture is designed to better estimate the LPS of clean speech; it can extract rich local features of the LPS through densely connected convolution layers. Meanwhile, sequential RDB connections are incorporated at various scales of the LPS, which significantly increases the flexibility and robustness of feature learning in the time-frequency domain. Simulations show that the proposed method achieves attractive speech enhancement performance in various acoustic environments. Specifically, in untrained acoustic tests with limited priors, e.g., unmatched signal-to-noise ratio (SNR) and unmatched noise category, RDGAN still outperforms the existing GAN-based methods and the masking-based method in terms of PESQ and other evaluation indexes. This indicates that our method generalizes better to untrained conditions.
Keywords: generative adversarial networks; neural networks; residual dense block; speech enhancement
Single Channel Speech Enhancement by De-noising Using Stationary Wavelet Transform (Cited by: 2)
15
Authors: 张德祥, 高清维, 陈军宁. Journal of Electronic Science and Technology of China, 2006, No. 1, pp. 39-42 (4 pages)
A method of single channel speech enhancement by de-noising with the stationary wavelet transform is proposed. The approach processes the multi-resolution wavelet coefficients individually and then reconstructs the recovered signal. The time-invariance of the stationary wavelet transform is particularly useful in speech de-noising. Experimental results show that the proposed algorithm can achieve an excellent balance between suppressing noise effectively and preserving as many characteristics of the original signal as possible. The de-noising algorithm offers superior performance in speech signal noise suppression.
Keywords: stationary wavelet transform; speech enhancement; de-noising; SNR
HMM-based noise estimator for speech enhancement
16
Authors: 许春冬, 夏日升, 应冬文, 李军锋, 颜永红. Journal of Beijing Institute of Technology (EI, CAS), 2014, No. 4, pp. 549-556 (8 pages)
A noise estimator was presented in this paper by modeling the log-power sequence with a hidden Markov model (HMM). The smoothing factor of this estimator was motivated by the speech presence probability in each frequency band. The HMM had a speech state and a nonspeech state, and each state consisted of a single Gaussian function. The mean of the nonspeech state was the estimate of the noise logarithmic power. To make the estimator run in an on-line manner, an HMM parameter update method based on a first-order recursive process was used. The noise signal was tracked while the HMM was sequentially updated. For the sake of reliability, some constraints were introduced into the HMM. The proposed algorithm was compared with conventional ones such as minimum statistics (MS) and improved minima controlled recursive averaging (IMCRA). The experimental results confirm its promising performance.
Keywords: noise estimation; hidden Markov model; constraints; first-order recursive process; speech enhancement
Single-channel speech enhancement method based on masking properties and minimum statistics
17
Authors: Jiang Xiaoping, Yao Tianren, Fu Hua. Journal of Systems Engineering and Electronics (SCIE, EI, CSCD), 2004, No. 2, pp. 217-224 (8 pages)
A single-channel enhancement method for noisy speech signals at very low signal-to-noise ratios is presented, based on the masking properties of the human auditory system and power spectral density estimation of nonstationary noise. It allows automatic adaptation of the parametric enhancement system in time and frequency, and finds the best tradeoff among the amount of noise reduction, the speech distortion, and the level of musical residual noise, based on a criterion correlated with perception and SNR. This leads to a significant reduction of the unnatural structure of the residual noise. Results with several noise types show that the enhanced speech is more pleasant to a human listener.
Keywords: auditory property; masking; varying SNR estimation; speech enhancement; minimum statistics
SPEECH ENHANCEMENT USING HARMONICS REGENERATION BASED ON MULTIBAND EXCITATION
18
Authors: Zhang Yanfang, Tang Kun, Cui Huijuan. Journal of Electronics (China), 2011, No. 4, pp. 565-570 (6 pages)
This paper proposes an algorithm that adopts harmonic regeneration as post-processing to improve the performance of speech enhancement using the traditional Short Time Spectral Amplitude (STSA) method. The proposed algorithm aims to alleviate the distortion of the high harmonics of speech enhanced by traditional STSA, and consequently to improve speech quality. We first detect the pitch, or fundamental frequency, of the speech enhanced by traditional STSA, and then divide the whole spectrum into multiple sub-bands, each centered on a harmonic. After that, a series of specially designed windows centered on each harmonic are applied to all the sub-bands in order to redistribute the energy within the sub-bands. The experimental results demonstrate that the method has both a theoretical and a practical basis.
Keywords: speech enhancement; short time spectral amplitude; harmonic regeneration; multiband excitation; pitch detection
Single channel speech enhancement via time-frequency dictionary learning (Cited by: 6)
19
Authors: HUANG Jianjun, ZHANG Xiongwei, ZHANG Yafei, ZOU Xia. Chinese Journal of Acoustics, 2013, No. 1, pp. 90-102 (13 pages)
A time-frequency dictionary learning approach is proposed to enhance speech contaminated by additive nonstationary noise. In this framework, a time-frequency dictionary learned from noise data is incorporated into the convolutive nonnegative matrix factorization framework. The update rules for the time-varying gains and the speech dictionary are derived with the noise dictionary precomputed. The magnitude spectra of speech are estimated by convolving the learned speech dictionary with the time-varying gains. Finally, noise is removed via binary time-frequency masking. The experimental results indicate that the proposed scheme gives better enhancement results in terms of speech quality measures. Moreover, the proposed algorithm outperforms multiband spectral subtraction and the non-negative sparse coding based noise reduction algorithm in nonstationary noise conditions.
Keywords: TIME WORK In STFT Single channel speech enhancement via time-frequency dictionary learning
A speech enhancement method based on Kalman filtering (Cited by: 2)
20
Author: SHEN Yaqiang (Zhejiang Normal University, Zhejiang 321004). Chinese Journal of Acoustics, 1994, No. 3, pp. 231-237 (7 pages)
In this paper, we study the enhancement of noisy speech signals using Kalman filtering. Speech signals corrupted by additive noise at low SNRs, from +5 to -5 dB, were used as the filter input. The additive noises considered are broadband white noise and colored noise, both uncorrelated with the speech signal. SNR improvements of about 7 to 10 dB were achieved.
Keywords: Kalman filtering; speech enhancement