Abstract
Single-channel separation (SCS) is a challenging scenario where the objective is to segregate speaker signals from their mixture with high accuracy. In this research, a novel framework called subband perceptually weighted transformation (SPWT) is developed to offer a perceptually relevant feature to replace the commonly used magnitude of the short-time Fourier transform (STFT). The main objectives of the proposed SPWT are to lower the spectral distortion (SD) and to improve the ideal separation quality. The performance of the SPWT is compared with that obtained using the mixmax and Wiener filter methods. A comprehensive statistical analysis is conducted to compare the SPWT quantization performance and ideal separation quality with those of two other features, the log-spectrum and the magnitude spectrum. Our evaluations show that the SPWT provides lower SD values and a more compact distribution of SD, leading to more acceptable subjective separation quality as evaluated using the mean opinion score.