Funding: Supported by the National Natural Science Foundation of China under Grant No. 60771033.
Abstract: Accurate endpoint detection is a necessary capability for speech recognition. A new energy measure based on the empirical mode decomposition (EMD) algorithm and the Teager energy operator (TEO) is proposed to locate the endpoint intervals of a speech signal embedded in noise. With EMD, a noisy signal can be decomposed into a number of sub-signals called intrinsic mode functions (IMFs), each of which is a zero-mean AM-FM component. The TEO is then used to extract the modulation energy of the IMF components as the desired feature. Examples are presented to show that the new measure is more effective than traditional measures. The experimental results indicate that the measure can improve the performance of endpoint detection algorithms and that its accuracy is satisfactory.
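As a rough illustration of the TEO step described above, the following minimal sketch applies the standard discrete Teager energy operator, psi[n] = x[n]^2 - x[n-1]*x[n+1], to a single component. It is not the authors' implementation: the EMD stage is omitted, a pure tone stands in for an IMF, and the names `teager_energy`, `amplitude`, and `omega` are hypothetical.

```python
import math


def teager_energy(x):
    """Discrete Teager energy psi[n] = x[n]^2 - x[n-1]*x[n+1].

    Only interior samples are returned, so the output is two
    samples shorter than the input.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# Assumption: a pure tone stands in for one IMF produced by EMD.
amplitude, omega = 2.0, 0.3
tone = [amplitude * math.cos(omega * n) for n in range(50)]
psi = teager_energy(tone)

# For A*cos(omega*n + phi) the operator is constant: (A*sin(omega))^2,
# which is why it tracks both amplitude and frequency modulation.
expected = (amplitude * math.sin(omega)) ** 2
```

For a pure tone the output is constant, so changes in the Teager energy of an IMF reflect changes in its amplitude or frequency, which is the property an endpoint detector of this kind relies on.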
Funding: Supported by the National Natural Science Foundation of China (61271352).
Abstract: Because the performance of traditional endpoint detection algorithms degrades as the environmental noise level increases, a recursive algorithm for calculating higher-order cumulants over a sliding window is proposed and applied to speech endpoint detection. Endpoint detection is additionally carried out with the energy feature. Experimental results show that both the computational efficiency and the noise robustness of the proposed algorithm are markedly improved compared with the traditional algorithm. The average probability of correct point detection (Pc-point) of the proposed voice activity detection (VAD) is 6.07% higher than that of the G.729b VAD across different noisy environments at different signal-to-noise ratios (SNRs).
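The abstract does not give the recursion itself, so the following is only one plausible sketch of the idea: the fourth-order cumulant of each window is obtained from running sums of powers that are updated as one sample enters and one leaves, instead of being recomputed from scratch. The cumulant order, the function names, and the test data are assumptions made for illustration.

```python
from collections import deque


def sliding_fourth_cumulant(samples, window):
    """Fourth-order cumulant of every full window, via running power sums."""
    buf = deque()
    s1 = s2 = s3 = s4 = 0.0
    out = []
    for x in samples:
        # Add the incoming sample to the running sums.
        buf.append(x)
        s1 += x
        s2 += x ** 2
        s3 += x ** 3
        s4 += x ** 4
        if len(buf) > window:
            # Remove the sample that just left the window.
            old = buf.popleft()
            s1 -= old
            s2 -= old ** 2
            s3 -= old ** 3
            s4 -= old ** 4
        if len(buf) == window:
            m1, r2, r3, r4 = s1 / window, s2 / window, s3 / window, s4 / window
            var = r2 - m1 ** 2
            mu4 = r4 - 4 * m1 * r3 + 6 * m1 ** 2 * r2 - 3 * m1 ** 4
            out.append(mu4 - 3 * var ** 2)  # c4 = mu4 - 3*var^2
    return out


def direct_fourth_cumulant(block):
    """Reference: recompute the same cumulant from scratch for one window."""
    n = len(block)
    mean = sum(block) / n
    var = sum((v - mean) ** 2 for v in block) / n
    mu4 = sum((v - mean) ** 4 for v in block) / n
    return mu4 - 3 * var ** 2


# Hypothetical test signal; any real-valued sequence would do.
data = [((n * 37) % 11) / 3.0 - 1.0 for n in range(40)]
recursive = sliding_fourth_cumulant(data, 8)
```

Each update costs a constant number of operations regardless of window length, which is the kind of saving a recursive formulation offers. Running sums can accumulate floating-point error on long signals, so a practical version might periodically recompute them from the buffer.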
Funding: Supported by the National Natural Science Foundation of China (61071215, 61271359, 61372146).
Abstract: The Perception Spectrogram Structure Boundary (PSSB) parameter is proposed for speech endpoint detection as a preprocessing step for speech or speaker recognition. First, a hearing-perception speech enhancement is carried out. Then a two-dimensional enhancement is performed on the sound spectrogram according to the difference between the deterministic distribution characteristic of speech and the random distribution characteristic of noise. Finally, the endpoint decision is made using the PSSB parameter. Experimental results show that, in low-SNR environments from -10 dB to 10 dB, the proposed algorithm may achieve higher accuracy than existing endpoint detection algorithms. A detection accuracy of 75.2% can be reached even at the extremely low SNR of -10 dB. The algorithm is therefore suitable for speech endpoint detection in low-SNR environments.