Funding: Supported by the National Basic Research Project of China (973) (2013CB329006), the National Natural Science Foundation of China (NSFC; 61101071, 61471220, 61021001), and the Tsinghua University Initiative Scientific Research Program.
Abstract: The growing number of mobile users and the diversification of service types have increased the demand for wireless network bandwidth in recent years. Although evolving transmission techniques can enlarge network capacity to some degree, they still cannot satisfy the requirements of mobile users. Meanwhile, following Moore's Law, the data processing capabilities of mobile user terminals are continuously improving. In this paper, we explore possible methods of trading the strong computational power of wireless terminals for communication transmission efficiency. Taking wireless video conversation as a specific scenario, we propose a model-based video coding scheme that learns the structures in multimedia content. Benefiting from both strong computing capability and pre-learned model priors, only low-dimensional parameters need to be transmitted, and the intact multimedia content can be reconstructed at the receivers in real time. Experimental results indicate that, compared with conventional video codecs, the proposed scheme significantly reduces the data rate with the aid of the computational capability of wireless terminals.
Funding: Supported by the National Natural Science Foundation of China (No. 69802007), the Motorola China Research Center (No. B38300), and the Natural Science Foundation of Guangdong (No. 011611).
Abstract: This work is concerned with the development and optimization of a signal model for scalable perceptual audio coding at low bit rates. A complementary two-part signal model consisting of Sines plus Noise (SN) is described. The paper presents a fundamental enhancement to the sinusoidal modeling component. The enhancement involves an audio signal scheme based on overlap-add sinusoidal modeling at three successive time scales: large, medium, and small. The sinusoidal modeling is done in an analysis-by-synthesis overlap-add manner across the three scales using psychoacoustically weighted matching pursuit. The sinusoidal modeling residual at the first scale is passed to the smaller scales to allow various signal features to be modeled at appropriate resolutions. This approach greatly helps to correct the pre-echo inherent in the sinusoidal model, and it improves the perceptual audio quality over our previous work on sinusoidal modeling while using the same number of sinusoids. The most obvious application for the SN model is in scalable, high-fidelity audio coding and signal modification.
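The residual cascade in the second abstract, where each scale models what the larger scale left behind, can be illustrated with a small sketch. This is not the authors' implementation: it is a minimal illustration that assumes flat (non-psychoacoustic) weighting, non-overlapping rectangular frames instead of overlap-add windows, and frequencies restricted to DFT bins. The function names, the frame lengths (256, 64, 16), and the two sinusoids per frame are all hypothetical choices made for the example.

```python
import math


def energy(x):
    """Sum of squared samples."""
    return sum(v * v for v in x)


def best_sinusoid(frame):
    """Pick the DFT-bin sinusoid (k, a, b) that removes the most energy.

    The atom is a*cos(w*i) + b*sin(w*i) with w = 2*pi*k/n. For integer
    bins 1 <= k < n/2 the cosine and sine are orthogonal with squared
    norm n/2, so the least-squares amplitudes are simple projections.
    """
    n = len(frame)
    best = (0.0, 0, 0.0, 0.0)  # (removed energy, k, a, b)
    for k in range(1, n // 2):
        w = 2.0 * math.pi * k / n
        c = sum(x * math.cos(w * i) for i, x in enumerate(frame))
        s = sum(x * math.sin(w * i) for i, x in enumerate(frame))
        a, b = 2.0 * c / n, 2.0 * s / n
        removed = (a * a + b * b) * n / 2.0
        if removed > best[0]:
            best = (removed, k, a, b)
    return best[1:]


def pursue(signal, frame_len, n_atoms):
    """Greedy matching pursuit on each non-overlapping frame of one scale."""
    residual = list(signal)
    params = []
    for start in range(0, len(residual) - frame_len + 1, frame_len):
        for _ in range(n_atoms):
            k, a, b = best_sinusoid(residual[start:start + frame_len])
            w = 2.0 * math.pi * k / frame_len
            # Analysis-by-synthesis: subtract the chosen atom, then re-analyse.
            for i in range(frame_len):
                residual[start + i] -= a * math.cos(w * i) + b * math.sin(w * i)
            params.append((start, frame_len, k, a, b))
    return params, residual


def multiscale(signal, frame_lens=(256, 64, 16), n_atoms=2):
    """Run the pursuit from the largest to the smallest scale.

    The residual of each scale becomes the input of the next one.
    """
    residual = list(signal)
    all_params = []
    for n in frame_lens:
        params, residual = pursue(residual, n, n_atoms)
        all_params.extend(params)
    return all_params, residual


if __name__ == "__main__":
    # Steady tone plus a short burst that only the small scale localises well.
    sig = [math.cos(2.0 * math.pi * 8 * i / 256) for i in range(256)]
    for j in range(16):
        sig[128 + j] += math.sin(2.0 * math.pi * 4 * j / 16)
    params, res = multiscale(sig)
    print(len(params), energy(sig), energy(res))
```

In this simplified setting the large frame captures the steady tone, while the short frames are left to describe the localized burst. That is the intuition behind modeling transients at a finer time resolution to limit pre-echo.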