Abstract: A three-mass model of the vocal cords and its mathematical formulation are discussed. Several kinds of typical hoarse speech caused by laryngeal diseases are simulated on a microcomputer, and the effects of different pathological factors of the vocal cords on the model parameters are studied. Typical spectral distributions of the simulated speech signals are given. In addition, hoarse speech signals from some typical cases are analyzed with digital signal processing methods, including FFT, LPC, cepstrum analysis, and pseudocolor encoding. The experimental results show that three-mass model analysis of the vocal cords is an efficient method for analyzing hoarse speech signals.
Funding: National Natural Science Foundation of China (No. 60071029).
Abstract: The perceptual effect of phase information in speech has been studied through subjective listening tests. When the phase spectrum of the speech is changed while the amplitude spectrum is kept unchanged, the tests show that: (1) if the envelope of the reconstructed speech signal is unchanged, there is no distinguishable perceptual difference between the original speech and the reconstructed speech; (2) the perceptual effect of the reconstructed speech depends mainly on the amplitude of the derivative of the additive phase; (3) with td defined as the maximum relative time shift between different frequency components of the reconstructed speech signal, the speech quality is excellent for td < 10 ms, good for 10 ms < td < 20 ms, common for 20 ms < td < 35 ms, and poor for td > 35 ms.
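The td thresholds above can be turned into a small calculation. The sketch below is only one possible reading of the abstract: it assumes that the per-band time shift is the negative derivative of the additive phase with respect to angular frequency (a group delay), and that td is the spread between the largest and smallest shift. The function names, the sampled-phase input format, and the handling of the exact boundary values (10, 20, 35 ms, which the abstract leaves open) are all assumptions.

```python
import math

def max_relative_time_shift(freqs_hz, phase_rad):
    """Estimate td (seconds) from an additive phase sampled at increasing frequencies.

    Assumption: the time shift of each band is -d(phase)/d(omega), estimated
    with a finite difference between neighbouring frequency samples.
    """
    delays = []
    for k in range(len(freqs_hz) - 1):
        d_omega = 2.0 * math.pi * (freqs_hz[k + 1] - freqs_hz[k])
        delays.append(-(phase_rad[k + 1] - phase_rad[k]) / d_omega)
    return max(delays) - min(delays)

def quality_label(td_seconds):
    """Map td to the four grades quoted in the abstract (boundary handling assumed)."""
    td_ms = td_seconds * 1000.0
    if td_ms < 10.0:
        return "excellent"
    if td_ms < 20.0:
        return "good"
    if td_ms < 35.0:
        return "common"
    return "poor"

# Hypothetical additive phase: no delay below 1 kHz, a 15 ms delay above it.
freqs = [0.0, 500.0, 1000.0, 1500.0, 2000.0]
step = 2.0 * math.pi * 500.0 * 0.015  # phase drop per 500 Hz for a 15 ms delay
phase = [0.0, 0.0, 0.0, -step, -2.0 * step]

td = max_relative_time_shift(freqs, phase)
print(round(td * 1000.0, 3), quality_label(td))
```

A purely linear additive phase gives the same shift in every band, so td is zero and the envelope is only delayed, which is consistent with finding (1).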
Funding: Project (17KJB510029) supported by the Natural Science Foundation of the Jiangsu Higher Education Institutions, China; Project (GXL2017004) supported by the Scientific Research Foundation of Nanjing Forestry University, China; Project (202102210132) supported by the Important Project of Science and Technology of Henan Province, China; Project (B2019-51) supported by the Scientific Research Foundation of Henan Polytechnic University, China; Project (51521003) supported by the Foundation for Innovative Research Groups of the National Natural Science Foundation of China; Project (KQTD2016112515134654) supported by the Shenzhen Science and Technology Program, China.
Abstract: A filter algorithm based on cochlear mechanics and the neuron filter mechanism is proposed from the viewpoint of vibration. It addresses the problem that non-linear amplification is rarely considered in studies of auditory filters. A cochlear mechanical transduction model is built to illustrate how audio signals are processed in the cochlea, and the neuron filter mechanism is then modeled to indirectly obtain outputs with the cochlear properties of frequency tuning and non-linear amplification. The mathematical description of the proposed algorithm is derived from the two models. The parameter space, the parameter selection rules, and the error correction of the proposed algorithm are discussed. The unit impulse responses in the time domain and the frequency domain are simulated and compared to examine the characteristics of the proposed algorithm. A 24-channel filter bank is then built on the proposed algorithm and applied to the enhancement of audio signals. The experiments and comparisons verify that the proposed algorithm can effectively divide audio signals into different frequency bands, significantly enhance the high-frequency parts, and improve speech enhancement performance in different noise environments, especially for babble noise and Volvo noise.
Funding: Supported by the Tianjin Municipal Science and Technology Commission (No. 09JCYBJC02200).
Abstract: Speech signals were separated in the frequency domain using the discrete wavelet transform (DWT) and independent component analysis (ICA). First, the mixed speech signals were decomposed into different frequency bands by the DWT, and the subbands of the speech signals were separated using ICA in each wavelet domain. Then, the permutation and scaling problems of frequency-domain blind source separation (BSS) were solved by exploiting the correlation between adjacent bins in speech signals. Finally, the source signals were reconstructed from single branches. Experiments were carried out with 2 sources and 6 microphones using speech signals at a sampling rate of 40 kHz. The microphones were aligned, with the 2 sources in front of them on the left and right. The separation of one male and one female speech signal lasted 2.5 s. The results show that the new method performs better than ICA alone, improving the signal-to-noise ratio by approximately 1 dB.
Funding: Supported by the Second Stage of the Brain Korea 21 Projects.
Abstract: In conventional source-filter models, the voiced and unvoiced components are considered independently. In practice, however, it is difficult to separate the source into two parts: an actual source is a mixture of the two, and the ratio varies with the content or the intention of the speaker. This work investigates the separation of the voiced and unvoiced components for different source models. Source signals were modeled from the residual signal obtained by inverse filtering, and three different source models were assumed. The parameters of each model were optimized against the original speech signal using a genetic algorithm. The resulting parameters were compared in terms of the mel-cepstral distance to the original signal, the spectrogram, and the spectral envelope of the synthesized signal. The optimization method achieves an improvement of 15% for the Klatt model, but there is little improvement in the modified residual case.
Funding: Natural Science Foundation of Jiangsu Province (No. BK2004150); National 863 Key Project (No. 2006AA010102).
Abstract: In this paper, a Covert Speech Telephone (CST) that works over the Internet is designed and implemented based on information hiding. To solve the problem of the large embedding capacity required for real-time information hiding, a steganographic system combined with a watermarking scheme is proposed, which converts the secret speech into watermark information. The basic idea is to use speech recognition to significantly reduce the amount of information that has to be transmitted covertly. Furthermore, an improved DFT watermarking scheme is proposed that adaptively chooses the embedding locations and applies multi-ary modulation. Through its GUI (Graphical User Interface) software, the CST operates in both ordinary and secure modes. It is a completely digital system with high speech quality. Objective and subjective tests show that the CST is robust against common signal processing attacks and steganalysis. The proposed scheme can be used in military applications.
Funding: Sponsored by the National Natural Science Foundation of China (Grant No. 60475016) and the Foundational Research Fund of Harbin Engineering University (Grant No. HEUF04092).
Abstract: To capture the presence of speech embedded in nonspeech events and background noise in shortwave non-cooperative communication, an algorithm for speech-stream detection in noisy environments is presented, based on Empirical Mode Decomposition (EMD) and the statistical properties of the higher-order cumulants of speech signals. With EMD, the noisy signals can be decomposed into different numbers of intrinsic mode functions (IMFs). The fourth-order cumulant (FOC) can then be used to extract the desired statistical features of the IMF components. Since higher-order cumulants are blind to Gaussian signals, the proposed method is especially effective for speech-stream detection when the speech signal is distorted by Gaussian noise. Owing to the self-adaptive decomposition by EMD, the proposed method also works well for non-Gaussian noise. The experiments show that the proposed algorithm can suppress different noise types at different SNRs and that it is robust in tests on real signals.
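The statement that higher-order cumulants are blind to Gaussian signals can be checked numerically. The sketch below is not the paper's detector and omits the EMD stage entirely; it only estimates the zero-lag fourth-order cumulant, c4 = E[x^4] - 3(E[x^2])^2 for a zero-mean signal, on synthetic data. Using a unit-variance Laplacian sequence as a stand-in for a speech-like (super-Gaussian) component is an assumption made for illustration.

```python
import random

def fourth_order_cumulant(x):
    """Zero-lag fourth-order cumulant of a sequence after removing its mean."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    m2 = sum(v * v for v in centered) / n   # second central moment
    m4 = sum(v ** 4 for v in centered) / n  # fourth central moment
    return m4 - 3.0 * m2 * m2

rng = random.Random(0)
n_samples = 20000

# Gaussian noise: the theoretical fourth-order cumulant is 0.
gaussian = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]

# Unit-variance Laplacian samples (difference of two exponentials with rate
# sqrt(2)): the theoretical fourth-order cumulant is 3.
rate = 2 ** 0.5
laplacian = [rng.expovariate(rate) - rng.expovariate(rate) for _ in range(n_samples)]

c4_gaussian = fourth_order_cumulant(gaussian)
c4_laplacian = fourth_order_cumulant(laplacian)
print("Gaussian c4:", round(c4_gaussian, 3))
print("Laplacian c4:", round(c4_laplacian, 3))
```

A feature of this kind stays near zero on frames that contain only Gaussian noise and departs from zero when a non-Gaussian component is present, which is the property the abstract relies on.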
Abstract: This research studies the features of chest and abdominal breathing in the Zhuang language. Two participants were recruited to record 30 news articles in Zhuang. The chest and abdominal breathing signals and the speech signal were recorded simultaneously. Programs for breathing analysis were written to extract parameters such as breathing reset amplitude, duration of the inhale phase, and slope of the exhale phase. The results show that the inhale and exhale resets of abdominal breathing occur earlier than those of chest breathing, and that the breathing reset is related to prosodic boundaries.
Funding: Supported by the National Natural Science Foundation of China (No. 10371055).
Abstract: The traditional correlation-based detector is optimal only for Gaussian data, but the Laplacian Probability Density Function (PDF) is more appropriate for modeling the coefficients in the Discrete Ridgelet Transform (DRT) domain. An additive maximum-likelihood detector based on the Laplacian PDF is analyzed, and a theoretical result for its performance is given. The experiments show that, for the DRT coefficients of many images, the error of the Laplacian model is smaller than that of the Gaussian model. The experiments also show that the Laplacian detector is superior to the traditional correlation-based detector.
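The difference between the two detectors can be sketched with the textbook log-likelihood ratio test for a known additive signal. The abstract does not give the paper's detector, so everything below is an assumption: the host coefficients are modeled as i.i.d. zero-mean Laplacian samples with scale b, the embedded signal w is known to the detector, and the names `laplacian_llr` and `correlation_statistic` are hypothetical. No ridgelet transform is computed; synthetic samples stand in for DRT coefficients.

```python
import random

def laplacian_llr(y, w, b):
    """Log-likelihood ratio of H1 (y = x + w) against H0 (y = x), x ~ Laplace(0, b).

    Each Laplacian log-density is -|x|/b plus a constant, so the constants cancel.
    """
    return sum(abs(yi) - abs(yi - wi) for yi, wi in zip(y, w)) / b

def correlation_statistic(y, w):
    """Classical correlation detector, which is the ML statistic for Gaussian hosts."""
    return sum(yi * wi for yi, wi in zip(y, w)) / len(y)

rng = random.Random(1)
n, b, strength = 5000, 1.0, 0.5

# Synthetic Laplacian host: difference of two exponentials with mean b.
host = [rng.expovariate(1.0 / b) - rng.expovariate(1.0 / b) for _ in range(n)]
# Hypothetical additive mark: random signs with a fixed strength.
mark = [strength * rng.choice((-1.0, 1.0)) for _ in range(n)]
marked = [x + w for x, w in zip(host, mark)]

llr_marked = laplacian_llr(marked, mark, b)
llr_clean = laplacian_llr(host, mark, b)
print("LLR with mark:", round(llr_marked, 1), "without:", round(llr_clean, 1))
print("correlation with mark:", round(correlation_statistic(marked, mark), 3),
      "without:", round(correlation_statistic(host, mark), 3))
```

Comparing the statistic with a threshold (zero for equal priors) gives the decision. The Laplacian statistic bounds the contribution of each coefficient by the mark strength, so large host coefficients cannot dominate the sum as they can in the correlation statistic.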
Funding: Project (61072087) supported by the National Natural Science Foundation of China; Project (2011-035) supported by the Shanxi Province Scholarship Foundation, China; Project (20120010) supported by Universities High-tech Foundation Projects, China; Project (2013021016-1) supported by the Youth Science and Technology Foundation of Shanxi Province, China; Projects (2013011016-1, 2012011014-1) supported by the Natural Science Foundation of Shanxi Province, China.
Abstract: Speech enhanced with the traditional wavelet threshold function suffers from auditory oscillation distortion and a low signal-to-noise ratio (SNR). To solve these problems, a new continuous, differentiable threshold function for speech enhancement is presented. First, the function adopts narrow threshold areas, preserving the weaker speech components and improving the speech quality. Second, based on the properties of continuous differentiability and non-fixed deviation, the function for each area is derived step by step mathematically, which ensures that the enhanced speech is continuous and smooth and removes the auditory oscillation distortion. Finally, combined with Bark wavelet packets, the function further improves human auditory perception. Experimental results show that the segmental SNR and the PESQ (perceptual evaluation of speech quality) score of the enhanced speech increase with this method compared with existing speech enhancement algorithms based on wavelet thresholding.
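For context, the two classical threshold functions that such work usually starts from can be written in a few lines. The abstract does not give the formula of the new function, so none is attempted here; the sketch only shows the standard hard and soft rules, whose drawbacks (a jump at the threshold for the hard rule, a constant offset for the soft rule) correspond to the "continuous differentiable" and "non-fixed deviation" properties mentioned above. The function names are illustrative.

```python
def hard_threshold(w, t):
    """Keep a wavelet coefficient unchanged if it exceeds the threshold, else zero it.

    The output jumps from 0 to about +/-t at |w| = t (a discontinuity).
    """
    return w if abs(w) > t else 0.0

def soft_threshold(w, t):
    """Shrink a wavelet coefficient toward zero by t.

    Continuous at |w| = t, but every kept coefficient is biased by exactly t.
    """
    if abs(w) <= t:
        return 0.0
    return w - t if w > 0 else w + t

threshold = 1.0
for coeff in (-3.0, -1.0001, -0.5, 0.5, 1.0001, 3.0):
    print(coeff, hard_threshold(coeff, threshold), soft_threshold(coeff, threshold))
```

A function of the kind the abstract describes would need to pass smoothly through the threshold region while approaching the identity for large coefficients, so that the deviation shrinks as the coefficient grows.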
Abstract: Sample entropy reflects both the change in the level of new information in a signal sequence and the amount of that new information. Using sample entropy as the feature for speech classification, the paper first extracts the sample entropy of the mixed signal, then uses the mean and variance to calculate the sample entropy of each signal, and finally applies K-means clustering for recognition. The simulation results show that the recognition rate can be increased to 89.2% with sample entropy.
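Sample entropy itself is straightforward to compute. The following is a minimal sketch of the commonly used definition, SampEn(m, r) = -ln(A / B), where B counts pairs of length-m templates that match within a tolerance and A counts the same for length m + 1. The parameter choices (m = 2, tolerance 0.2 times the standard deviation) are conventional defaults, not values taken from the paper, and the K-means stage is not shown.

```python
import math
import random

def sample_entropy(x, m=2, r=0.2):
    """Sample entropy of a sequence, with tolerance r times its standard deviation."""
    n = len(x)
    mean = sum(x) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    tol = r * sd

    def count_matches(length):
        # Use the same n - m starting points for both template lengths.
        templates = [x[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                # Chebyshev distance between two templates
                dist = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if dist <= tol:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # undefined for too few matches; reported as infinity
    return -math.log(a / b)

rng = random.Random(0)
periodic = [0.0, 1.0] * 50                   # perfectly predictable sequence
noise = [rng.random() for _ in range(200)]   # unpredictable sequence

print("periodic:", sample_entropy(periodic))
print("noise:", sample_entropy(noise))
```

The periodic sequence yields zero because every template that matches at length m also matches at length m + 1, while the noise sequence yields a clearly larger value. This separation is what makes the measure usable as a classification feature.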
Abstract: To develop a more robust endpoint detection algorithm, this paper first proposes a fuzzy adaptive smoothing algorithm. The general idea underlying adaptive smoothing is to adapt the short-term sub-band mean of the amplitude to the local attributes of speech on the basis of discontinuity measures. The adaptive smoothing algorithm in this paper uses a scale-space framework through the minimum description length (MDL). We recommend using the fuzzy multi-attribute decision-making approach to select the sub-bands in which the word boundary can be detected more reliably. The process and simulation of the fuzzy adaptive smoothing algorithm are given. The parameters are the mean amplitude of the audible frequency range (300-3700 Hz) and the sub-band mean of the amplitude (16-band filter bank). We selected the audible band energy because of its usefulness in detecting high-energy regions and in distinguishing speech from noise. In addition, the fuzzy adaptive smoothing algorithm is applied to sub-band speech to use the full range of frequency information.
Abstract: This letter investigates an improved blind source separation algorithm based on the Maximum Entropy (ME) criterion. The original ME algorithm uses a fixed exponential or sigmoid function as the nonlinear mapping function, which cannot match the original signal very well. A parameter estimation method is employed in this letter to approximate the probability density function of any signal with a parameter-steered generalized exponential function. An improved learning rule and a natural gradient update formula for the unmixing matrix are also presented. The algorithm can separate mixtures of super-Gaussian signals as well as mixtures of sub-Gaussian signals. The simulation experiment demonstrates the efficiency of the algorithm.
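For orientation, the standard textbook forms that this kind of algorithm builds on are written out below. They are not the letter's improved learning rule, which the abstract does not state. The symbols are assumptions: alpha is the shape parameter and beta the scale of a generalized Gaussian (generalized exponential) density, phi is the resulting score function, mu is a step size, x is the mixture vector, and W is the unmixing matrix.

```latex
% Generalized Gaussian density with shape \alpha > 0 and scale \beta > 0
p(y) = \frac{\alpha}{2\beta\,\Gamma(1/\alpha)}
       \exp\!\left( -\left( \frac{|y|}{\beta} \right)^{\alpha} \right)

% Score function derived from the density
\varphi(y) = -\frac{d}{dy} \ln p(y)
           = \frac{\alpha}{\beta^{\alpha}}\, |y|^{\alpha - 1} \operatorname{sign}(y)

% Standard natural-gradient update of the unmixing matrix, with y = W x
W \leftarrow W + \mu \left( I - \varphi(\mathbf{y})\, \mathbf{y}^{T} \right) W
```

With alpha = 2 the density is Gaussian, alpha < 2 gives a heavier-tailed (super-Gaussian) shape such as the Laplacian at alpha = 1, and alpha > 2 gives a flatter (sub-Gaussian) shape. Estimating alpha from the data is what lets one nonlinearity family cover both source types, in contrast to a fixed sigmoid.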
Funding: National Natural Science Foundation of China (grant numbers 61271082, 61201029, 61102094).
Abstract: In this paper, we applied RobustICA to speech separation and made a comprehensive comparison with FastICA based on the separation results. In a series of speech signal separation tests, RobustICA required less separation time than FastICA and was more stable, and the speech separated by RobustICA had lower separation errors. Across the 14 groups of speech separation tests, the separation time of RobustICA was 3.185 s less than that of FastICA, a reduction of nearly 68%. The separation errors of FastICA fluctuated between 0.004 and 0.02, while the errors of RobustICA remained around 0.003. Furthermore, RobustICA showed better separation robustness than FastICA. The experimental results show that RobustICA can be applied successfully to speech signal separation and that it outperforms FastICA in this task.
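The two timing figures can be combined to estimate the absolute separation times. This is a back-of-the-envelope reading, assuming that the 68% reduction is measured relative to the FastICA time; t_F and t_R are introduced here for the FastICA and RobustICA times and do not appear in the source.

```latex
t_F - t_R = 3.185\ \text{s}, \qquad \frac{t_F - t_R}{t_F} \approx 0.68

t_F \approx \frac{3.185}{0.68} \approx 4.68\ \text{s}, \qquad
t_R \approx 4.68 - 3.185 \approx 1.50\ \text{s}
```

Because the abstract says "nearly 68%", these values are approximate.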
Abstract: Based on the W-disjoint orthogonality of speech mixtures, a space discriminative function was proposed to enumerate and localize competing speakers in the surrounding environment. A Wiener-like postfilter was then developed to adaptively suppress interference. Experimental results with a hands-free speech recognizer under various SNR and competing-speaker settings show that an error reduction of nearly 69% can be obtained with a two-channel small-aperture microphone array relative to the conventional single-microphone baseline system. Comparisons were made with the traditional delay-and-sum and Griffiths-Jim adaptive beamforming techniques to further assess the effectiveness of the method.
Funding: Supported by the National Natural Science Foundation of China (No. 60472058, 60975017) and the Fundamental Research Funds for the Central Universities (No. 2009B32614, 2009B32414).
Abstract: Based on the approximate sparseness of speech in a wavelet basis, compressed sensing theory is applied to compress and reconstruct speech signals. In contrast to the one-dimensional orthogonal wavelet transform (OWT), a two-dimensional OWT combining the Dmeyer and biorthogonal wavelets is first proposed to raise the running efficiency of speech frame processing, and a threshold is set to improve the sparseness. An adaptive subgradient projection method (ASPM) is then adopted for speech reconstruction in compressed sensing. In addition, a mechanism that adaptively adjusts the inflation parameter across iterations is designed for fast convergence. Theoretical analysis and simulation results show that the algorithm converges quickly, has a lower reconstruction error, and is more robust at different noise intensities.