
Speech enhancement with a GSC-like structure employing sparse coding

Abstract: Speech communication is often degraded by various types of interfering signals. To improve the quality of the desired signal, the generalized sidelobe canceller (GSC), which uses a reference signal to estimate the interfering signal, has attracted the attention of researchers. However, the interference suppression of the GSC is limited because a small amount of residual desired signal leaks into the reference signal. To overcome this problem, we use sparse coding to suppress the residual desired signal while preserving the reference signal. Sparse coding with a learned dictionary is usually used to reconstruct the desired signal. Because training samples of the desired signal for dictionary learning are not observable in a real environment, the reconstructed desired signal may contain a large amount of residual interfering signal. In contrast, training samples of the interfering signal for interferer dictionary learning can be obtained through voice activity detection (VAD) during periods when the desired signal is absent. Since the reference signal of an interfering signal is coherent with the interferer dictionary, it can be well reconstructed by sparse coding, while the residual desired signal is removed. The performance of the GSC improves because the estimate of the interfering signal obtained with the proposed reference signal is more accurate than that obtained with the conventional reference signal. Simulations and experiments in a real acoustic environment show that the proposed method is effective in suppressing interfering signals.
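The central step described in the abstract, reconstructing the reference signal over an interferer dictionary so that leaked desired signal is discarded, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the solver is orthogonal matching pursuit (the abstract does not name one), the dictionary `D` is a random orthonormal stand-in for a dictionary learned from VAD-selected interferer frames, and the frame length, sparsity level, and signal amplitudes are arbitrary.

```python
import numpy as np

def omp(D, y, n_nonzero):
    """Orthogonal matching pursuit: approximate y with at most n_nonzero atoms of D."""
    residual = y.copy()
    support = []
    coef = np.zeros(D.shape[1])
    for _ in range(n_nonzero):
        # Select the atom most correlated with the current residual.
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:
            break
        support.append(k)
        # Refit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ sol
    if support:
        coef[support] = sol
    return coef

rng = np.random.default_rng(0)
frame_len, n_atoms, sparsity = 64, 16, 3  # illustrative sizes (assumed)

# Stand-in for a learned interferer dictionary: orthonormal columns (assumed).
D, _ = np.linalg.qr(rng.standard_normal((frame_len, n_atoms)))

# Interferer frame that is sparse in the dictionary.
true_coef = np.zeros(n_atoms)
true_coef[[2, 7, 11]] = [1.5, -1.0, 2.0]
interferer = D @ true_coef

# Residual desired signal leaking into the reference (small, not sparse in D).
leak = rng.standard_normal(frame_len)
leak *= 0.1 / np.linalg.norm(leak)
reference = interferer + leak

# Sparse-coded reference: keeps what the interferer dictionary can explain.
cleaned = D @ omp(D, reference, sparsity)

err_before = np.linalg.norm(reference - interferer)
err_after = np.linalg.norm(cleaned - interferer)
print(f"error before: {err_before:.4f}, after: {err_after:.4f}")
```

In this toy setting the sparse-coded reference is closer to the true interferer than the raw reference, because only the part of the leak that happens to lie in the span of the selected atoms survives. In a full GSC the cleaned reference would then drive the adaptive interference canceller, which is not shown here.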
Source: Journal of Zhejiang University-Science C (Computers and Electronics), indexed in SCIE and EI, 2014, Issue 12, pp. 1154-1163 (10 pages)
Funding: Project supported by the National Basic Research Program (973) of China (No. 2012CB316400) and the National Natural Science Foundation of China (No. 61171151)
Keywords: Generalized sidelobe canceller, Speech enhancement, Voice activity detection, Dictionary learning, Sparse coding

