Abstract: A Bark-band residual noise model that accounts for the human hearing mechanism is proposed to efficiently complement the sinusoidal model in parametric audio coding. The time-varying spectrum of the residual noise is reconstructed from Bark-scale piecewise-constant magnitude estimates combined with random phases. In the proposed noise model, Bark-band information is obtained by a short-time FFT, and windowed overlap-add is used to remove boundary discontinuities. SVQ is also incorporated into the parameter quantization process to meet the low bit-rate coding requirement. Simulation results and informal listening tests show that combining the sinusoidal model with the Bark-band noise model yields better synthesized audio quality than the original sinusoidal-modeling audio codec.
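As a rough illustration of the Bark-band idea described in this abstract, the following minimal sketch estimates one magnitude per Bark band for a single frame and resynthesizes a noise frame from those magnitudes with random phases. It is not the authors' implementation. The function names, the Zwicker-Terhardt Bark approximation, the RMS estimate per band, the 64-sample frame, and the naive DFT are all assumptions made for the example; windowed overlap-add and SVQ quantization are omitted.

```python
import cmath
import math
import random


def hz_to_bark(f_hz):
    """Zwicker-Terhardt approximation of the Bark scale (an assumed choice)."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)


def dft(x):
    """Naive O(n^2) DFT, used here only to avoid external dependencies."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def bark_band_magnitudes(frame, sample_rate):
    """Return (one RMS magnitude per Bark band, Bark band index of each bin)."""
    n = len(frame)
    half = n // 2 + 1  # bins 0..n/2 carry all information of a real frame
    spectrum = dft(frame)
    bands = [int(hz_to_bark(k * sample_rate / n)) for k in range(half)]
    energy = [0.0] * (max(bands) + 1)
    count = [0] * (max(bands) + 1)
    for k in range(half):
        energy[bands[k]] += abs(spectrum[k]) ** 2
        count[bands[k]] += 1
    # Piecewise-constant estimate: every bin in a band shares the band RMS.
    mags = [math.sqrt(e / c) if c else 0.0 for e, c in zip(energy, count)]
    return mags, bands


def synthesize_noise_frame(mags, bands, n, rng):
    """Rebuild a real frame from band magnitudes and uniformly random phases."""
    spec = [0j] * n
    for k in range(n // 2 + 1):
        if k == 0 or 2 * k == n:
            spec[k] = complex(mags[bands[k]], 0.0)  # DC and Nyquist stay real
        else:
            phase = rng.uniform(-math.pi, math.pi)
            spec[k] = cmath.rect(mags[bands[k]], phase)
            spec[n - k] = spec[k].conjugate()  # Hermitian symmetry gives real output
    # Inverse DFT; the imaginary part is only rounding error and is dropped.
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]


rng = random.Random(0)
frame = [rng.gauss(0.0, 1.0) for _ in range(64)]  # stand-in for a residual frame
mags, bands = bark_band_magnitudes(frame, 16000)
noise = synthesize_noise_frame(mags, bands, len(frame), rng)
```

Re-analyzing `noise` with `bark_band_magnitudes` returns the same band magnitudes as the original frame, which is the property the decoder relies on: only the per-band magnitudes need to be transmitted, not the phases.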
Abstract: This paper proposes improvements to a low bit-rate parametric audio coder with a sinusoidal model as its kernel. First, we propose a new method to effectively order and select the perceptually most important sinusoids. The sinusoid that contributes most to the reduction of the overall noise-to-mask ratio (NMR) is chosen. Combined with our improved parametric psychoacoustic model and advanced peak riddling techniques, the number of sinusoids required is greatly reduced and the coding efficiency is greatly enhanced. A lightweight version is also given that reduces the amount of computation with only a small sacrifice in performance. Second, we propose two enhancement techniques for sinusoid synthesis: bandwidth enhancement and line enhancement. With little overhead, the effective bandwidth can be extended by one more octave, and the timbre tends to sound much brighter, thicker, and more pleasing.
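The selection rule in this abstract can be pictured as a greedy loop. The sketch below is only a toy under strong assumptions: the function name `select_sinusoids`, the per-band noise energies, and the fixed per-band masking thresholds are hypothetical, and a real psychoacoustic model would update the thresholds as sinusoids are added. Overall NMR is taken here as the sum over bands of residual energy divided by the masking threshold.

```python
def select_sinusoids(candidates, band_noise, band_mask, max_count):
    """Greedily pick the sinusoids that most reduce a toy overall NMR.

    candidates: list of (freq_hz, power, band_index) tuples (hypothetical layout)
    band_noise: residual energy per band before any sinusoid is removed
    band_mask:  masking threshold per band, assumed fixed for simplicity
    """
    residual = list(band_noise)
    remaining = list(candidates)
    chosen = []

    def nmr_reduction(candidate):
        _, power, band = candidate
        # A sinusoid cannot remove more energy than the band still holds.
        return min(power, residual[band]) / band_mask[band]

    while remaining and len(chosen) < max_count:
        best = max(remaining, key=nmr_reduction)
        if nmr_reduction(best) <= 0.0:
            break  # nothing left that lowers the NMR
        _, power, band = best
        residual[band] -= min(power, residual[band])
        remaining.remove(best)
        chosen.append(best)
    return chosen


peaks = [(440.0, 4.0, 0), (3000.0, 1.0, 1), (450.0, 3.0, 0)]
picked = select_sinusoids(peaks, band_noise=[8.0, 1.5], band_mask=[4.0, 0.1], max_count=2)
```

In this example the weak 3000 Hz peak is selected first because its band has a low masking threshold, which is the kind of perceptual reordering the abstract describes, as opposed to sorting by raw power.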
Funding: Supported by the National Natural Science Foundation of China (No. 69802007), the Motorola China Research Center (No. B38300), and the Natural Science Foundation of Guangdong (No. 011611).
Abstract: This work is concerned with the development and optimization of a signal model for scalable perceptual audio coding at low bit rates. A complementary two-part signal model consisting of Sines plus Noise (SN) is described. The paper presents a fundamental enhancement to the sinusoidal modeling component: overlap-add sinusoidal modeling is carried out at three successive time scales (large, medium, and small). The sinusoidal modeling is done in an analysis-by-synthesis overlap-add manner across the three scales using psychoacoustically weighted matching pursuits. The sinusoidal modeling residual at the first scale is passed to the smaller scales so that various signal features can be modeled at appropriate resolutions. This approach greatly helps to correct the pre-echo inherent in the sinusoidal model and improves the perceptual audio quality over our previous work on sinusoidal modeling while using the same number of sinusoids. The most obvious applications of the SN model are scalable, high-fidelity audio coding and signal modification.
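The multi-scale idea in this abstract, where the residual of a long-frame sinusoidal fit is handed to shorter frames, can be sketched with a plain matching pursuit. The code below is a simplified illustration and not the paper's method: the function names are hypothetical, frames are rectangular and non-overlapping instead of overlap-add, the dictionary is restricted to DFT-bin frequencies, and the psychoacoustic weighting is left out.

```python
import cmath
import math


def best_sinusoid(residual):
    """Return (bin, complex projection) of the strongest DFT-bin sinusoid."""
    n = len(residual)
    best_k, best_c = 0, 0j
    for k in range(1, n // 2):
        c = sum(residual[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        if abs(c) > abs(best_c):
            best_k, best_c = k, c
    return best_k, best_c


def extract_atoms(frame, count):
    """Matching pursuit on one frame: pick a sinusoid, subtract it, repeat."""
    n = len(frame)
    residual = list(frame)
    atoms = []
    for _ in range(count):
        k, c = best_sinusoid(residual)
        amp, phase = 2.0 * abs(c) / n, cmath.phase(c)
        for t in range(n):
            residual[t] -= amp * math.cos(2.0 * math.pi * k * t / n + phase)
        atoms.append((k, amp, phase))
    return atoms, residual


def multiscale_pursuit(signal, frame_lengths, atoms_per_frame):
    """Run the pursuit at each scale, passing the residual to the next scale."""
    residual = list(signal)
    model = []
    for n in frame_lengths:  # e.g. large, medium, small
        for start in range(0, len(residual) - n + 1, n):
            atoms, res = extract_atoms(residual[start:start + n], atoms_per_frame)
            residual[start:start + n] = res
            model.append((start, n, atoms))
    return model, residual


# A steady tone plus a short burst confined to the last 16 samples.
signal = [math.cos(2.0 * math.pi * 4 * t / 64) for t in range(64)]
for t in range(48, 64):
    signal[t] += 0.5 * math.cos(2.0 * math.pi * 2 * (t - 48) / 16)
model, residual = multiscale_pursuit(signal, [64, 32, 16], 1)
```

In this example the 64-sample scale captures the steady tone, while the burst is only fully removed once the 16-sample frames are reached. This mirrors the abstract's point that short events, which would otherwise smear across a long frame as pre-echo, are better modeled at the smaller scales.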