2 articles found
Evaluating single-channel speech separation performance in transform-domain (Cited by: 1)
1
Authors: Pejman MOWLAEE, Abolghasem SAYADIYAN, Hamid SHEIKHZADEH. Journal of Zhejiang University-Science C (Computers and Electronics), SCIE, EI, 2010, No. 3, pp. 160-174 (15 pages)
Single-channel separation (SCS) is a challenging scenario where the objective is to segregate speaker signals from their mixture with high accuracy. In this research, a novel framework called subband perceptually weighted transformation (SPWT) is developed to offer a perceptually relevant feature to replace the commonly used magnitude of the short-time Fourier transform (STFT). The main objectives of the proposed SPWT are to lower the spectral distortion (SD) and to improve the ideal separation quality. The performance of the SPWT is compared with that obtained using the mixmax and Wiener filter methods. A comprehensive statistical analysis is conducted to compare the SPWT quantization performance and ideal separation quality with those of two other features, the log-spectrum and the magnitude spectrum. Our evaluations show that the SPWT provides lower SD values and a more compact distribution of SD, leading to more acceptable subjective separation quality as evaluated using the mean opinion score.
Keywords: Single-channel separation (SCS); Magnitude spectrum; Vector quantization (VQ); Subband perceptually weighted transformation (SPWT); Spectral distortion (SD)
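The abstract above reports results in terms of spectral distortion (SD) but does not give its formula. As a rough illustration only, the sketch below uses one common definition, the root-mean-square difference between two log-magnitude spectra in dB. The function name `spectral_distortion_db` and the `floor` parameter are assumptions made for this example, and the paper's exact SD measure may differ.

```python
import math


def spectral_distortion_db(ref_mag, est_mag, floor=1e-10):
    """RMS difference between two log-magnitude spectra, in dB.

    Assumed definition for illustration; not taken from the paper.
    ref_mag, est_mag: magnitude spectra of one frame (equal length).
    floor: small positive value that keeps log10 away from zero.
    """
    if len(ref_mag) != len(est_mag) or not ref_mag:
        raise ValueError("spectra must be non-empty and of equal length")
    sq_sum = 0.0
    for r, e in zip(ref_mag, est_mag):
        # Per-bin difference of the two spectra on a dB scale.
        diff = 20.0 * math.log10(max(r, floor)) - 20.0 * math.log10(max(e, floor))
        sq_sum += diff * diff
    # Average over frequency bins, then take the square root.
    return math.sqrt(sq_sum / len(ref_mag))
```

Under this definition, identical spectra give 0 dB, and an estimate that is uniformly ten times larger than the reference gives 20 dB. A study such as the one described would typically average this per-frame value over many frames and inspect its distribution.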
Split vector quantization for sinusoidal amplitude and frequency
2
Authors: Pejman MOWLAEE, Abolghasem SAYADIAN, Hamid SHEIKHZADEH. Journal of Zhejiang University-Science C (Computers and Electronics), SCIE, EI, 2011, No. 2, pp. 140-154 (15 pages)
In this paper, we suggest applying a tree structure to the sinusoidal parameters. The suggested sinusoidal coder aims to find the coded sinusoidal parameters by minimizing a likelihood function in a least-squares (LS) sense. From a rate-distortion standpoint, we address the problem of how to allocate the available bits among different frequency bands to code the sinusoids at each frame. To further analyze the quantization behavior of the proposed method, we assess its quantization performance against other methods: the short-time Fourier transform (STFT) based coder commonly used for speech enhancement or separation, and the line spectral frequency (LSF) coder used in speech coding. Through extensive simulations, we show that, compared with previous STFT-based methods, the proposed quantizer leads to less spectral distortion as well as higher perceived quality for signals re-synthesized from the coded parameters in a model-based approach. The proposed method lowers the complexity and, owing to its tree structure, enables a rapid search. It provides flexibility for use in many speaker-independent applications by finding the most likely frequency vectors selected from a list of frequency candidates. Therefore, the proposed quantizer can be considered an attractive candidate for model-based speech applications in both speaker-dependent and speaker-independent scenarios.
Keywords: Short-time Fourier transform; Split vector quantization; Sinusoidal modeling; Spectral distortion
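For readers unfamiliar with the split vector quantization named in the title and keywords, the sketch below shows the generic idea: the parameter vector is cut into sub-vectors, and each is quantized against its own codebook by nearest-neighbor search. It is a minimal illustration under stated assumptions, not the paper's coder. The function names, the toy codebooks, and the amplitude/frequency split are all hypothetical, and the tree-structured search and bit allocation described in the abstract are not modeled.

```python
def nearest(codebook, vec):
    """Index of the codeword closest to vec (squared Euclidean distance)."""
    best_idx, best_dist = 0, float("inf")
    for idx, code in enumerate(codebook):
        dist = sum((c - v) ** 2 for c, v in zip(code, vec))
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx


def split_vq_encode(vec, codebooks):
    """Quantize each sub-vector with its own codebook; one index per split."""
    if len(vec) != sum(len(cb[0]) for cb in codebooks):
        raise ValueError("vector length must match the sum of codeword sizes")
    indices, start = [], 0
    for cb in codebooks:
        dim = len(cb[0])  # dimension of this split
        indices.append(nearest(cb, vec[start:start + dim]))
        start += dim
    return indices


def split_vq_decode(indices, codebooks):
    """Rebuild the quantized vector by concatenating the chosen codewords."""
    out = []
    for idx, cb in zip(indices, codebooks):
        out.extend(cb[idx])
    return out


# Hypothetical toy codebooks: two amplitude values, then one frequency value.
amp_codebook = [[0.0, 0.0], [1.0, 1.0]]
freq_codebook = [[100.0], [200.0], [300.0]]
codes = split_vq_encode([0.9, 1.2, 210.0], [amp_codebook, freq_codebook])
decoded = split_vq_decode(codes, [amp_codebook, freq_codebook])
```

Splitting keeps each codebook small: two codebooks of sizes 2 and 3 are searched with 5 distance computations, whereas a single joint codebook covering the same combinations would need 6 entries, and the gap grows quickly with dimension.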