1,189 articles found
An Analysis of the Hoarse Speech Signals by the Three Mass Model of Vocal Cords *
1
Authors: 程启明, 陈雪丽, 万德钧. Journal of Southeast University (English Edition), EI, CAS, 1998, No. 1, pp. 81-85 (5 pages)
A three-mass model of the vocal cords and its mathematical expression are discussed. Different kinds of typical hoarse speech due to laryngeal diseases are simulated on a microcomputer, and the effects of different pathological factors of the vocal cords on the model parameters are studied. Some typical spectral distributions of the simulated speech signals are given. Moreover, hoarse speech signals of some typical cases are analyzed by digital signal processing methods, including FFT, LPC, cepstrum techniques, and pseudocolor encoding. The experimental results show that three-mass model analysis of the vocal cords is an efficient method for analyzing hoarse speech signals.
Keywords: hoarse speech signal; three mass model of vocal cords; laryngeal diseases
Enhancing Parkinson’s Disease Diagnosis Accuracy Through Speech Signal Algorithm Modeling (Cited by: 1)
2
Authors: Omar M. El-Habbak, Abdelrahman M. Abdelalim, Nour H. Mohamed, Habiba M. Abd-Elaty, Mostafa A. Hammouda, Yasmeen Y. Mohamed, Mohanad A. Taifor, Ali W. Mohamed. Computers, Materials & Continua, SCIE, EI, 2022, No. 2, pp. 2953-2969 (17 pages)
Parkinson’s disease (PD), one of whose symptoms is dysphonia, is a prevalent neurodegenerative disease. The use of outdated diagnosis techniques, which yield inaccurate and unreliable results, continues to represent an obstacle in early-stage detection and diagnosis for clinical professionals in the medical field. To solve this issue, the study proposes using machine learning and deep learning models to analyze processed speech signals of patients’ voice recordings. Datasets of these processed speech signals were obtained and experimented on by random forest and logistic regression classifiers. Results were highly successful, with 90% accuracy produced by the random forest classifier and 81.5% by the logistic regression classifier. Furthermore, a deep neural network was implemented to investigate whether such variation in method could add to the findings. It proved to be effective, as the neural network yielded an accuracy of nearly 92%. Such results suggest that it is possible to accurately diagnose early-stage PD merely by testing patients’ voices. This research calls for a revolutionary diagnostic approach in decision support systems, and is the first step in a market-wide implementation of healthcare software dedicated to aiding clinicians in early diagnosis of PD.
Keywords: Early diagnosis; logistic regression; neural network; Parkinson’s disease; random forest; speech signal processing algorithms
COMPRESSED SPEECH SIGNAL SENSING BASED ON THE STRUCTURED BLOCK SPARSITY WITH PARTIAL KNOWLEDGE OF SUPPORT (Cited by: 1)
3
Authors: Ji Yunyun, Yang Zhen, Xu Qian. Journal of Electronics (China), 2012, No. 1, pp. 62-71 (10 pages)
Structural and statistical characteristics of signals can improve the performance of Compressed Sensing (CS). Two kinds of features of the Discrete Cosine Transform (DCT) coefficients of voiced speech signals are discussed in this paper. The first is the block sparsity of the DCT coefficients of voiced speech, formulated from two different aspects: the distribution of the DCT coefficients of voiced speech, and the comparison of reconstruction performance between the mixed program and Basis Pursuit (BP). The block sparsity of the DCT coefficients of voiced speech means that some block-sparse CS algorithms can be used to improve the recovery performance of speech signals. It is proved by the simulation results of the mixed program which is an improved version of the mixed program. The second is the well-known fact that the large DCT coefficients of voiced speech are concentrated at low frequencies. In line with this feature, a special Gaussian and Partial Identity Joint (GPIJ) matrix is constructed as the sensing matrix for voiced speech signals. Simulation results show that the GPIJ matrix outperforms the classical Gaussian matrix for speech signals of male and female adults.
Keywords: Compressed Sensing (CS); Speech signals; Sensing matrix; Block sparsity
Speech Signal Detection Based on Bayesian Estimation by Observing Air-Conducted Speech under Existence of Surrounding Noise with the Aid of Bone-Conducted Speech (Cited by: 1)
4
Authors: Hisako Orimoto, Akira Ikuta, Kouji Hasegawa. Intelligent Information Management, 2021, No. 4, pp. 199-213 (15 pages)
To apply speech recognition systems in real settings where handwriting is difficult, ranging from inspection and maintenance operations in industrial factories to recording and reporting routines at construction sites, countermeasures against surrounding noise are indispensable. In this study, a signal detection method that removes noise from actual speech signals is proposed, using Bayesian estimation with the aid of bone-conducted speech. More specifically, by introducing Bayes’ theorem based on the observation of air-conducted speech contaminated by surrounding background noise, a new type of noise removal algorithm is theoretically derived. In the proposed speech detection method, bone-conducted speech is utilized to obtain a precise estimate of the speech signal. The effectiveness of the proposed method is experimentally confirmed by applying it to air- and bone-conducted speech measured in a real environment in the presence of surrounding background noise.
Keywords: Speech Signal Detection; Bayesian Estimation; Air- and Bone-Conducted Speeches; Surrounding Noise
A DISTRIBUTED COMPRESSED SENSING APPROACH FOR SPEECH SIGNAL DENOISING
5
Authors: Ji Yunyun, Yang Zhen. Journal of Electronics (China), 2011, No. 4, pp. 509-517 (9 pages)
Compressed sensing, a new area of signal processing that has risen in recent years, seeks to minimize the number of samples that must be taken from a signal for precise reconstruction. The precondition of compressed sensing theory is the sparsity of signals. In this paper, two methods to estimate the sparsity level of the signal are formulated. An approach to estimate the sparsity level directly from the noisy signal is then presented. Moreover, a scheme based on distributed compressed sensing for speech signal denoising is described, which exploits multiple measurements of the noisy speech signal to construct block-sparse data and then reconstructs the original speech signal using the block-sparse model-based Compressive Sampling Matching Pursuit (CoSaMP) algorithm. Several simulation results demonstrate the accuracy of the estimated sparsity level and show that this denoising system for noisy speech signals can achieve favorable performance, especially when the speech signals suffer severe noise.
Keywords: Distributed compressed sensing; Sparsity estimation; Speech signal; Denoising
Implementation of Hybrid Deep Reinforcement Learning Technique for Speech Signal Classification
6
Authors: R. Gayathri, K. Sheela Sobana Rani. Computer Systems Science & Engineering, SCIE, EI, 2023, No. 7, pp. 43-56 (14 pages)
Classification of speech signals is a vital part of speech signal processing systems. With the advent of speech coding and synthesis, the classification of speech signals has become more accurate and faster. Conventional methods are considered inaccurate because of the uncertainty and diversity of speech signals in real speech signal classification. In this paper, we perform efficient speech signal classification using a series of neural network classifiers with reinforcement learning operations. Prior to classification, the study extracts the essential features from the speech signal using cepstral analysis. The features are extracted by converting the speech waveform to a parametric representation to obtain a relatively minimized data rate. To improve the precision of classification, Generative Adversarial Networks are used to classify the speech signal after features are extracted from it using the cepstral coefficients. The classifiers are initially trained with these features, and the best classifier is chosen to perform classification on new datasets. The validation of testing sets is evaluated using RL, which provides feedback to the classifiers. Finally, at the user interface, the signals are played by decoding the signal after it is retrieved from the classifier based on the input query. The results are evaluated in terms of accuracy, recall, precision, F-measure, and error rate, where the generative adversarial network attains a higher accuracy than the other methods: Multi-Layer Perceptron, Recurrent Neural Networks, Deep Belief Networks, and Convolutional Neural Networks.
Keywords: Neural network (NN); reinforcement learning (RL); cepstral coefficient; speech signal classification
Analysis of Deaf Speakers’ Speech Signal for Understanding the Acoustic Characteristics by Territory Specific Utterances
7
Authors: Nirmaladevi Jaganathan, Bommannaraja Kanagaraj. Circuits and Systems, 2016, No. 8, pp. 1709-1721 (13 pages)
An important concern for the deaf community is the partial or total inability to hear. This may affect the development of language during childhood, which limits their habitual existence. Consequently, to assist such deaf speakers through an assistive mechanism, an effort has been made to understand the acoustic characteristics of deaf speakers by evaluating territory-specific utterances. Speech signals are acquired from 32 normal and 32 deaf speakers uttering ten words in Tamil, a native Indian language. Speech parameters such as pitch, formants, signal-to-noise ratio, energy, intensity, jitter, and shimmer are analyzed. From the results, it has been observed that the acoustic characteristics of deaf speakers differ significantly and that their quantitative measures dominate those of normal speakers for the words considered. The study also reveals that the informative part of the speech of normal and deaf speakers may be identified using the acoustic features. In addition, these attributes may be used for differential corrections of a deaf speaker’s speech signal and to help listeners understand the conveyed information.
Keywords: Deaf Speaker; Hard of Hearing; Deaf Speech Processing; Assistive Mechanism for Deaf Speaker; Speech Correction; Speech Signal Processing
SELECTION OF PROPER EMBEDDING DIMENSION IN PHASE SPACE RECONSTRUCTION OF SPEECH SIGNALS
8
Authors: Lin Jiayu, Huang Zhiping, Wang Yueke, Shen Zhenken (Dept. 4 and Dept. 8, National University of Defence Technology, Changsha 410073). Journal of Electronics (China), 2000, No. 2, pp. 161-169 (9 pages)
In phase space reconstruction of time series, the selection of the embedding dimension is important. Based on the idea of checking the behavior of near neighbors in the reconstruction dimension, a new method to determine the proper minimum embedding dimension is constructed. This method has a sound theoretical basis and can lead to good results. It can indicate the noise level in the data to be reconstructed and estimate the reconstruction quality. It is applied to speech signal reconstruction, and the generic embedding dimension of speech signals is deduced.
Keywords: Speech signals; Chaos; Phase space reconstruction; Embedding dimension; False nearest neighbor; Noise level estimation; Reconstruction quality
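As an illustration of the near-neighbor idea in the abstract above, here is a minimal Python sketch of time-delay embedding together with the classic false-nearest-neighbor test. It is not the new method the paper proposes, whose details are not given in the abstract. The function names, the `ratio` threshold of 10, and the brute-force neighbor search are assumptions made for illustration only.

```python
import math


def delay_embed(series, dim, delay):
    """Return delay vectors (x[i], x[i+delay], ..., x[i+(dim-1)*delay])."""
    if dim < 1 or delay < 1:
        raise ValueError("dim and delay must be positive integers")
    count = len(series) - (dim - 1) * delay
    return [
        tuple(series[i + k * delay] for k in range(dim))
        for i in range(max(count, 0))
    ]


def false_neighbor_fraction(series, dim, delay, ratio=10.0):
    """Fraction of nearest neighbors in `dim` dimensions that move far apart
    once one more delay coordinate is added (classic criterion, assumed here)."""
    # Keep only points that also have a coordinate in dimension dim + 1.
    count = len(series) - dim * delay
    if count < 2:
        raise ValueError("series too short for this dim and delay")
    points = delay_embed(series, dim, delay)[:count]
    false_count = 0
    tested = 0
    for i, p in enumerate(points):
        best_j, best_d = -1, math.inf
        for j, q in enumerate(points):  # brute-force nearest neighbor search
            if i != j:
                d = math.dist(p, q)
                if d < best_d:
                    best_j, best_d = j, d
        if best_d == 0.0:
            continue  # identical points: the distance ratio is undefined
        extra = abs(series[i + dim * delay] - series[best_j + dim * delay])
        tested += 1
        if extra / best_d > ratio:
            false_count += 1
    return false_count / tested if tested else 0.0


if __name__ == "__main__":
    sine = [math.sin(0.3 * n) for n in range(200)]
    for m in (1, 2, 3):
        print(m, false_neighbor_fraction(sine, m, delay=5))
```

A common way to use such a test is to increase `dim` until the fraction stops dropping; the smallest such dimension is taken as the minimum embedding dimension.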
Heart Rate Extraction from Vowel Speech Signals (Cited by: 6)
9
Authors: Abdelwadood Mesleh, Dmitriy Skopin, Sergey Baglikov, Anas Quteishat. Journal of Computer Science & Technology, SCIE, EI, CSCD, 2012, No. 6, pp. 1243-1251 (9 pages)
This paper presents a novel non-contact method for extracting heart rate from vowel speech signals. The proposed method is based on modeling the relationship between the production of vowel speech signals and heart activity in humans, where it is observed that the moment of a heartbeat causes a short increment (evolution) of the vowel speech formants. The short-time Fourier transform (STFT) is used to detect the formant maximum peaks so as to accurately estimate the heart rate. Compared with a traditional contact pulse oximeter, the average accuracy of the proposed non-contact heart rate extraction method exceeds 95%. The proposed method is expected to play an important role in modern medical applications.
Keywords: Electrocardiogram; feature extraction; heart rate; short-time Fourier transform; vowel speech signal
A coherent method for finding arrival directions of speech signals and its application for noise reduction in microphone array
10
Authors: Zheng Liu and Fumitada Itakura (Department of Electrical Engineering, Faculty of Engineering, Nagoya University, Furo-Cho, Chikusa-Ku, Nagoya, 464-01, Japan). Chinese Journal of Acoustics, 1997, No. 3, pp. 214-228 (15 pages)
A method for finding the arrival directions of speech signals with a microphone array is proposed. We first analyze the uniform microphone array and give a design for a microphone array applied to hands-free speech recognition. Combining the traditional direction-finding technique of MUltiple SIgnal Classification (MUSIC) with the focusing matrix method, we improve the resolving power of the microphone array for multiple speech sources. As one application of finding the Direction of Arrival (DOA), a new microphone-array system for noise reduction is proposed. The new system is based on a maximum likelihood estimation technique that reconstructs superimposed signals from different directions by using DOA information. The DOA information is obtained with the focusing MUSIC method, which has been proven to perform better than the conventional MUSIC method on speaker localization [1].
Keywords: IEEE ASSP
Support vector machines for emotion recognition in Chinese speech (Cited by: 8)
11
Authors: 王治平, 赵力, 邹采荣. Journal of Southeast University (English Edition), EI, CAS, 2003, No. 4, pp. 307-310 (4 pages)
Support vector machines (SVMs) are utilized for emotion recognition in Chinese speech in this paper. Both binary-class discrimination and multi-class discrimination are discussed. It is shown that the emotional features form a nonlinear problem in the input space, and that SVMs based on nonlinear mapping can solve it more effectively than linear methods. Multi-class classification based on SVMs with a soft decision function is constructed to classify the four emotion situations. Compared with the principal component analysis (PCA) method and a modified PCA method, SVMs give the best results in multi-class discrimination by using nonlinear kernel mapping.
Keywords: speech signal; emotion recognition; support vector machines
STUDY ON PHASE PERCEPTION IN SPEECH (Cited by: 6)
12
Authors: Tong Ming, Bian Zhengzhong, Li Xiaohui, Dai Qijun, Chen Yanpu. Journal of Electronics (China), 2003, No. 5, pp. 387-392 (6 pages)
The perceptual effect of the phase information in speech has been studied by auditory subjective tests. On the condition that the phase spectrum of the speech is changed while the amplitude spectrum is unchanged, the tests show that: (1) if the envelope of the reconstructed speech signal is unchanged, there is no distinguishable auditory difference between the original speech and the reconstructed speech; (2) the auditory perception effect of the reconstructed speech depends mainly on the amplitude of the derivative of the additive phase; (3) td is the maximum relative time shift between different frequency components of the reconstructed speech signal. The speech quality is excellent while td < 10 ms; good while 10 ms < td < 20 ms; common while 20 ms < td < 35 ms; and poor while td > 35 ms.
Keywords: speech signal; Auditory perception; Phase spectrum; Additive phase
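The quality bands reported for td in the abstract above can be written as a small lookup, sketched here in Python. The function name is hypothetical. The abstract uses strict inequalities and does not say which class the boundary values 10, 20, and 35 ms belong to; assigning each boundary to the lower-quality class is an assumption of this sketch.

```python
def rate_phase_shift(td_ms):
    """Map the maximum relative time shift td (in ms) to the quality label
    used in the abstract. Boundary handling is an assumption, see above."""
    if td_ms < 0:
        raise ValueError("td must be non-negative")
    if td_ms < 10:
        return "excellent"
    if td_ms < 20:
        return "good"
    if td_ms < 35:
        return "common"
    return "poor"


if __name__ == "__main__":
    for td in (5, 15, 25, 40):
        print(td, rate_phase_shift(td))
```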
BLIND SPEECH SEPARATION FOR ROBOTS WITH INTELLIGENT HUMAN-MACHINE INTERACTION
13
Authors: Huang Yulei, Ding Zhizhong, Dai Lirong, Chen Xiaoping. Journal of Electronics (China), 2012, No. 3, pp. 286-293 (8 pages)
The speech recognition rate deteriorates greatly in human-machine interaction when the speaker's speech mixes with a bystander's voice. This paper proposes a time-frequency approach to Blind Source Separation (BSS) for intelligent Human-Machine Interaction (HMI). The main idea of the algorithm is to simultaneously diagonalize the correlation matrices of the pre-whitened signals at different time delays for every frequency bin in the time-frequency domain. The proposed method has two merits: (1) fast convergence; (2) a high signal-to-interference ratio of the separated signals. Numerical evaluations are used to compare the performance of the proposed algorithm with two other deconvolution algorithms. An efficient algorithm to resolve permutation ambiguity is also proposed in this paper. The proposed algorithm saves more than 10% of computation time with properly selected parameters and achieves good performance for both simulated convolutive mixtures and speech recorded in a real room.
Keywords: Blind Source Separation (BSS); Blind deconvolution; Speech signal processing; Human-machine interaction; Simultaneous diagonalization
Speech Encryption with Fractional Watermark
14
Authors: Yan Sun, Cun Zhu, Qi Cui. Computers, Materials & Continua, SCIE, EI, 2022, No. 10, pp. 1817-1825 (9 pages)
Research on the features of speech and image signals is carried out from two perspectives, the time domain and the frequency domain. Speech and image signals are non-stationary, so the Fourier transform (FT) is not suited to their non-stationary characteristics. When short-term stable speech is obtained by windowing and framing, the subsequent processing of the signal is completed by the Discrete Fourier Transform (DFT). The Fast Discrete Fourier Transform is a commonly used analysis method for speech and image signal processing in the frequency domain. It has the problem that the window size must be adjusted to achieve a desired resolution. The Fractional Fourier Transform, by contrast, has both time-domain and frequency-domain processing capabilities. This paper performs global speech encryption by combining speech with an image using the Fractional Fourier Transform. A watermark image processed by the fractional transform is embedded in the speech signal, and the embedded watermark has the effect of rotation and superposition, which improves the security of the speech. The results show that the proposed speech encryption method based on the Fractional Fourier Transform has a higher security level. The technology is easy to extend to practical applications.
Keywords: Fractional Fourier Transform; Watermark; Speech signal processing; Image processing
Enhanced Frequency-Domain Frost Algorithm Using Conjugate Gradient Techniques for Speech Enhancement (Cited by: 1)
15
Authors: Shengkui Zhao, Douglas L. Jones. Journal of Electronic Science and Technology, CAS, 2012, No. 2, pp. 158-162 (5 pages)
In this paper, the frequency-domain Frost algorithm is enhanced by using conjugate gradient techniques for speech enhancement. Unlike the non-adaptive approach of computing the optimum minimum variance distortionless response (MVDR) solution by inverting the correlation matrix, the Frost algorithm, which implements the stochastic constrained least mean square (LMS) algorithm, can adaptively converge to the MVDR solution in the mean-square sense, but with a very slow convergence rate. We propose a frequency-domain constrained conjugate gradient (FDCCG) algorithm to speed up the convergence. The devised FDCCG algorithm avoids the matrix inversion and exhibits fast convergence. Speech enhancement experiments for a target speech signal corrupted by two and five interfering speech signals are demonstrated using a four-channel acoustic-vector-sensor (AVS) microphone array and show superior performance.
Keywords: Adaptive signal processing; convergence; correlation; microphone arrays; speech enhancement
High Performance Speech Compression System (Cited by: 6)
16
Authors: Ke Liu, Zhichun Mu, Zhong Wang (Information Engineering School, University of Science & Technology Beijing, Beijing 100083, China). Journal of University of Science and Technology Beijing, CSCD, 2001, No. 3, pp. 229-233 (5 pages)
Since Pulse Code Modulation emerged in 1937, digitized speech has developed rapidly because of its outstanding voice quality, reliability, robustness, and security in communication. How to reduce channel width without loss of speech quality, however, remains a crucial problem in speech coding theory. A new full-duplex digital speech communication system based on the AMBE-1000(TM) vocoder and the ATMEL 89C51 microcontroller is introduced. It shows higher voice quality than the current mobile phone system while needing only a quarter of the channel width of the latter. The prospective application areas of the system include satellite communication, IP phone, virtual meetings and, most importantly, the defence industry.
Keywords: digital signal processing; digital speech compression; digital communication; full-duplex; coding rate
Artificial Intelligence for Speech Recognition Based on Neural Networks (Cited by: 3)
17
Authors: Takialddin Al Smadi, Huthaifa A. Al Issa, Esam Trad, Khalid A. Al Smadi. Journal of Signal and Information Processing, 2015, No. 2, pp. 66-72 (7 pages)
Speech recognition, or speech to text, includes capturing and digitizing the sound waves, transforming them into basic linguistic units or phonemes, constructing words from phonemes, and contextually analyzing the words to ensure the correct spelling of words that sound the same. Approach: studying the possibility of designing a software system using one of the techniques of artificial intelligence applications, neural networks, where this system is able to distinguish the sound signals and neural networks of irregular users. Fixed weights are trained on those forms first, and then the system gives the output match for each of these formats at high speed. The proposed neural network study is based on solutions of speech recognition tasks, detecting signals using angular modulation, and detection of modulated techniques.
Keywords: Speech Recognition; Neural Networks; Artificial Networks; Signals Processing
SPEECH ENHANCEMENT BASED ON SECOND ORDER ARCHITECTURE AND INFORMATION MAXIMIZATION THEORY
18
Authors: 虞晓, 胡光锐, 陈玮. Journal of Shanghai Jiaotong University (Science), EI, 1998, No. 2, pp. 58-62 (5 pages)
Based on the idea of adaptive noise cancellation (ANC), a second-order architecture is proposed for speech enhancement. According to the information maximization theory, the corresponding gradient descent algorithm is proposed. With real speech signals in the simulation, the new algorithm demonstrates good performance in speech enhancement. The main advantage of the new architecture is that clean speech signals can be obtained with less distortion.
Keywords: Speech enhancement; Blind signal separation; Information maximization; ANC
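Two-Stage Speech Enhancement Algorithm Based on Temporal Convolutional Networks
19
Authors: 周翊, 王艺, 赵宇, 刘宏清. 《信号处理》, CSCD, PKU Core, 2024, No. 12, pp. 2219-2227 (9 pages)
During transmission, speech signals are often corrupted by noise, echo, and other interference, which degrades signal quality and intelligibility. Speech enhancement algorithms were developed to remove noise and interference from the signal and improve speech quality. Compared with traditional algorithms, deep-learning-based speech enhancement algorithms achieve better results. However, existing algorithms have the following problems: their designs generally consider only the speech component of the noisy speech and do not fully account for the noise component, and most of them use a single network to perform the enhancement task, which demands a high-performing network. To address this, this paper proposes the Spectral Masking Two-Stage Time-Frequency Processing Network (SM-TSTFN) for speech enhancement. The network decomposes speech enhancement into two stages, magnitude spectrum prediction and complex spectrum prediction, and estimates the clean speech progressively. In the first stage, both noise and speech are learning targets: the magnitude spectrum of the noisy speech is the input, and the magnitude spectra of the noise and the speech are preliminarily estimated. In the second stage, the complex spectrum of the noisy speech is the input, and the spectrum of the clean speech is estimated with the help of the first-stage predictions. For the second stage, the paper also designs a Time-Frequency Processing Module (TFPM). The module combines a Long Short-Term Memory (LSTM) network with a temporal convolutional network (Temporal Convolution Module, TCN) and can extract features along the time and frequency dimensions separately. Experimental results on the dataset show that the proposed SM-TSTFN achieves higher scores than other models, improves speech signal quality more effectively and accurately, and raises speech intelligibility.
Keywords: deep learning; speech enhancement; signal processing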
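Optimization of a Wavelet-Transform-Based Speech Signal Denoising Algorithm
20
Authors: 王红娟, 尚莹莹. 《电声技术》, 2024, No. 5, pp. 67-69 (3 pages)
This paper studies wavelet-transform-based speech signal denoising in depth and, to address the poor performance of traditional methods in complex noise environments, proposes an optimized wavelet denoising method based on an adaptive threshold. First, the basic principle of wavelet denoising is analyzed. Second, the mathematical model of the adaptive threshold technique is studied in depth and applied to the wavelet transform, with the threshold adjusted dynamically to suit different noise environments. Finally, experiments on the Aurora dataset are used for validation. The experimental results show that the method removes noise effectively.
Keywords: wavelet transform; speech denoising; adaptive threshold; speech signal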
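The basic principle of wavelet threshold denoising mentioned in the abstract above can be sketched in a few lines of Python. This is not the paper's adaptive-threshold model, which the abstract does not specify. The sketch assumes a single-level Haar transform and substitutes the well-known universal threshold, with the noise level estimated from the median absolute detail coefficient, as a stand-in for a data-driven threshold. All function names are hypothetical.

```python
import math
import statistics


def haar_forward(x):
    """Single-level Haar transform of an even-length sequence."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; small values become zero."""
    return math.copysign(max(abs(value) - threshold, 0.0), value)


def denoise(x):
    """Soft-threshold the Haar detail coefficients of x.

    The threshold sigma * sqrt(2 ln N), with sigma = median(|d|) / 0.6745,
    is an assumed stand-in for the paper's adaptive threshold.
    """
    if len(x) < 2 or len(x) % 2:
        raise ValueError("x must have even length of at least 2")
    approx, detail = haar_forward(x)
    sigma = statistics.median(abs(d) for d in detail) / 0.6745
    threshold = sigma * math.sqrt(2.0 * math.log(len(x)))
    shrunk = [soft_threshold(d, threshold) for d in detail]
    return haar_inverse(approx, shrunk)
```

In practice a multi-level decomposition with a smoother wavelet would be used; the single Haar level keeps the sketch self-contained.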