Funding: Supported by the National Natural Science Foundation of China (No. 60971129), the National Research Program of China (973 Program) (No. 2011CB302303), and the Scientific Innovation Research Program of College Graduate in Jiangsu Province (No. CXLX11_0408).
Abstract: Structural and statistical characteristics of signals can improve the performance of Compressed Sensing (CS). This paper discusses two features of the Discrete Cosine Transform (DCT) coefficients of voiced speech signals. The first is the block sparsity of the DCT coefficients of voiced speech, which is established from two aspects: the distribution of the DCT coefficients, and a comparison of reconstruction performance between the mixed program and Basis Pursuit (BP). Block sparsity means that block-sparse CS algorithms can be used to improve the recovery performance of speech signals; this is supported by simulation results for an improved version of the mixed program. The second is the well-known concentration of the large DCT coefficients of voiced speech at low frequencies. In line with this feature, a special Gaussian and Partial Identity Joint (GPIJ) matrix is constructed as the sensing matrix for voiced speech signals. Simulation results show that the GPIJ matrix outperforms the classical Gaussian matrix for speech signals of male and female adults.
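The low-frequency concentration of DCT coefficients mentioned in this abstract can be illustrated with a small, self-contained sketch. It is not the authors' experiment: the "voiced" frame is a synthetic sum of harmonics, and the function names, the 8 kHz sampling rate, the 120 Hz pitch, and the cutoff index are all assumptions chosen for illustration.

```python
import math


def dct_ii(x):
    """Orthonormal DCT-II of a real sequence (O(N^2); fine for short frames)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def synthetic_voiced_frame(n=256, fs=8000.0, f0=120.0, harmonics=6):
    """Toy stand-in for voiced speech (assumption): harmonics of f0 with 1/h amplitudes."""
    return [
        sum(math.sin(2.0 * math.pi * f0 * h * t / fs) / h
            for h in range(1, harmonics + 1))
        for t in range(n)
    ]


def low_band_energy_fraction(coeffs, cutoff):
    """Fraction of total energy carried by the first `cutoff` DCT coefficients."""
    total = sum(c * c for c in coeffs)
    return sum(c * c for c in coeffs[:cutoff]) / total


frame = synthetic_voiced_frame()
coeffs = dct_ii(frame)
# DCT index k corresponds to roughly k * fs / (2 * n) Hz, so 64 is about 1 kHz here.
fraction = low_band_energy_fraction(coeffs, cutoff=64)
print(f"energy in the lowest quarter of DCT coefficients: {fraction:.3f}")
```

Real speech frames would need to be loaded from recordings; the synthetic frame only shows why a sensing matrix that treats low-frequency coefficients specially can be attractive.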
Abstract: In this study, fifteen elderly participants performed the mechanical version of the three-disk Tower of London task, with changes in the movements, under concurrent articulatory suppression. The same executive task was also performed without the verbal secondary task, and the results of the two conditions were compared. The comparison provided evidence for a role of inner speech in the more complicated Tower of London tasks, although overall the results suggest a more prominent role for the inner scribe in spatial planning in this executive task. The roles of inner speech and the inner scribe in the Tower of London task are then described using the “Baddeley and Logie” working memory model.
Funding: Supported by the National Natural Science Foundation of China (No. 60971129), the National Research Program of China (973 Program) (No. 2011CB302303), and the Scientific Innovation Research Program of College Graduate in Jiangsu Province (No. CXLX11_0408).
Abstract: Compressed sensing, a new area of signal processing that has emerged in recent years, seeks to minimize the number of samples that must be taken from a signal for precise reconstruction. The precondition of compressed sensing theory is the sparsity of signals. In this paper, two methods to estimate the sparsity level of a signal are formulated, and an approach to estimate the sparsity level directly from the noisy signal is then presented. Moreover, a scheme based on distributed compressed sensing for speech signal denoising is described; it exploits multiple measurements of the noisy speech signal to construct block-sparse data and then reconstructs the original speech signal using the block-sparse model-based Compressive Sampling Matching Pursuit (CoSaMP) algorithm. Simulation results demonstrate the accuracy of the estimated sparsity level and show that this denoising system can achieve favorable performance, especially when the speech signals suffer from severe noise.
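The abstract does not spell out its sparsity-level estimators, so the sketch below is only a generic heuristic, not the paper's method: it counts how many of the largest-magnitude coefficients are needed to capture a chosen fraction of the signal energy. The function name `estimate_sparsity` and the 0.99 default threshold are assumptions made for illustration.

```python
def estimate_sparsity(coeffs, energy_fraction=0.99):
    """Smallest K such that the K largest-magnitude coefficients hold
    at least `energy_fraction` of the total energy (illustrative heuristic)."""
    if not 0.0 < energy_fraction <= 1.0:
        raise ValueError("energy_fraction must lie in (0, 1]")
    total = sum(c * c for c in coeffs)
    if total == 0.0:
        return 0  # an all-zero vector has sparsity level 0
    running = 0.0
    ordered = sorted(coeffs, key=abs, reverse=True)
    for k, c in enumerate(ordered, start=1):
        running += c * c
        if running >= energy_fraction * total:
            return k
    return len(coeffs)


# Two dominant coefficients plus one small one (hypothetical data).
example = [0.0, 5.0, 0.0, 0.0, -3.0, 0.0, 0.1, 0.0]
print(estimate_sparsity(example))
```

An estimate like this is typically passed as the sparsity parameter K to a greedy solver such as CoSaMP; on noisy data the threshold would have to be chosen with the noise level in mind.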
Abstract: In the present study, the ROCF test was first conducted with 30 healthy young individuals in a quiet environment (Experiment 1) to examine how the score varies among different methods of memorizing the figure. In this environment, no significant difference in score was observed between the copying and outer speech groups. This suggests that some members of the copying group may have used outer speech in a voice too low to be heard, or moved their lips without vocalization, achieving the same effect as outer speech and thus erasing any difference from the outer speech group. In contrast, the score differed markedly between the mouthpiece group and the copying and outer speech groups. Because lip movements were suppressed in the mouthpiece group, the unconscious use of outer speech was also prevented, possibly leading to poorer results. Based on these findings, it may be possible to enhance the effects of rehabilitation in clinical settings by encouraging patients to memorize using outer speech, vocalizing the contents of their training.
Funding: Supported by the National Natural Science Foundation of China (61501204, 61601198), the Hebei Province Natural Science Foundation (E2016202341), the Hebei Province Foundation for Returned Scholars (C2012003038), the Shandong Province Natural Science Foundation (ZR2015FL010), and the Science and Technology Program of University of Jinan (XKY1710).
Abstract: Speech emotion recognition (SER) in noisy environments is a vital issue in artificial intelligence (AI). In this paper, speech samples are reconstructed to remove the added noise. Acoustic features extracted from the reconstructed samples are then selected to build an optimal feature subset with better emotional recognizability. A multiple-kernel (MK) support vector machine (SVM) classifier, solved by semi-definite programming (SDP), is adopted in the SER procedure. The proposed method is demonstrated on the Berlin Database of Emotional Speech. The recognition accuracies of the original, noisy, and reconstructed samples, classified by both single-kernel (SK) and MK classifiers, are compared and analyzed. The experimental results show that the proposed method is effective and robust in the presence of noise.
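As a rough illustration of the multiple-kernel idea in this abstract, the sketch below builds a Gram matrix from a non-negative weighted sum of two base kernels. It does not reproduce the paper's classifier: the base kernels, the fixed weights, and all function names are assumptions, and the weights here are set by hand rather than learned by semi-definite programming.

```python
import math


def linear_kernel(x, y):
    """Plain inner product of two feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel; gamma is an illustrative choice."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def combined_kernel(x, y, weights, kernels):
    """Non-negative weighted sum of base kernels, which is again a valid kernel."""
    if any(w < 0 for w in weights):
        raise ValueError("kernel weights must be non-negative")
    return sum(w * k(x, y) for w, k in zip(weights, kernels))


def gram_matrix(samples, weights, kernels):
    """Gram matrix of the combined kernel over a list of feature vectors."""
    return [[combined_kernel(xi, xj, weights, kernels) for xj in samples]
            for xi in samples]


# Hypothetical two-dimensional acoustic feature vectors.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
gram = gram_matrix(features, weights=[0.5, 0.5],
                   kernels=[linear_kernel, rbf_kernel])
for row in gram:
    print([round(v, 4) for v in row])
```

A matrix like this could be handed to an SVM solver that accepts precomputed kernels; learning the weights themselves is the part the abstract attributes to SDP.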