Color inconsistency between views is an important problem in multi-view video applications, such as free viewpoint television and other three-dimensional video systems. In this paper, a coding-oriented multi-view video color correction method is proposed that is combined with multi-view video coding. We first separate the foreground and background in the first group of pictures (GOP) by using the SKIP coding mode. Then, by transferring the means and standard deviations of the backgrounds, color correction is performed for each frame in the GOP, and multi-view video coding is performed and used to renew the backgrounds. Experimental results show that the proposed method achieves better performance in both color correction and multi-view video coding.
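The mean and standard deviation transfer mentioned in this abstract can be illustrated with a minimal sketch. It assumes that pixels are given as flat lists of three-channel tuples and that the background pixels of both views have already been extracted; the function names `channel_stats` and `transfer_mean_std` are hypothetical, and the SKIP-mode foreground/background separation and the background renewal step are not modeled here.

```python
from statistics import fmean, pstdev


def channel_stats(pixels):
    """Return a (mean, std) pair per channel for a list of 3-channel pixels."""
    channels = list(zip(*pixels))
    return [(fmean(ch), pstdev(ch)) for ch in channels]


def transfer_mean_std(pixels, src_background, ref_background, eps=1e-6):
    """Correct `pixels` so that the source background statistics match the reference.

    Assumption: statistics are estimated on background pixels only and the
    resulting per-channel linear mapping is applied to every pixel of the frame.
    """
    src_stats = channel_stats(src_background)
    ref_stats = channel_stats(ref_background)
    corrected = []
    for px in pixels:
        corrected.append(tuple(
            # Shift to zero mean, rescale the spread, then shift to the reference mean.
            (v - s_mean) * (r_std / max(s_std, eps)) + r_mean
            for v, (s_mean, s_std), (r_mean, r_std) in zip(px, src_stats, ref_stats)
        ))
    return corrected
```

A call such as `transfer_mean_std(frame_pixels, src_bg, ref_bg)` returns floating-point values; clipping and rounding to the valid sample range are left to the caller.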
Cloud computing has drastically changed the delivery and consumption of live streaming content. The designs, challenges, and possible uses of cloud computing for live streaming are studied. A comprehensive overview of the technical and business issues surrounding cloud-based live streaming is provided, including the benefits of cloud computing, the various live streaming architectures, and the challenges that live streaming service providers face in delivering high-quality, real-time services. The different techniques used to improve the performance of video streaming, such as adaptive bit-rate streaming, multicast distribution, and edge computing, are discussed, and the necessity of low-latency and high-quality video transmission in cloud-based live streaming is underlined. Issues such as improving user experience and live streaming service performance using cutting-edge technology, like artificial intelligence and machine learning, are discussed. In addition, the legal and regulatory implications of cloud-based live streaming, including issues with network neutrality, data privacy, and content moderation, are addressed. The future of cloud computing for live streaming is then examined, with a look at the most likely new developments in terms of trends and technology. For technology vendors, live streaming service providers, and regulators, the findings have major policy-relevant implications. Suggestions on how stakeholders should address these concerns and take advantage of the potential presented by this rapidly evolving sector are provided, together with insights into the key challenges and opportunities associated with cloud-based live streaming.
Distributed video coding (DVC) is a new video coding approach based on the Wyner-Ziv theorem. The novel uplink-friendly DVC, which offers low-complexity, low-power, and low-cost video encoding, has attracted growing research interest. In this paper, a new method based on multiple view geometry is presented for spatial side information generation in uncalibrated video sensor networks. The trifocal tensor encapsulates all the geometric relations among three views that are independent of scene structure; it can be computed from image correspondences alone, without requiring knowledge of the motion or calibration. Simulation results show that trifocal tensor-based spatial side information improves the rate-distortion performance over motion compensation based interpolation side information by a maximum gap of around 2 dB. Fusion then merges the temporal and spatial side information to improve the quality of the final side information. Simulation results show a rate-distortion gain of about 0.4 dB.
Popular video coding standards like H.264 and MPEG, which work on the principle of motion-compensated predictive coding, demand substantial computational resources at the encoder, increasing its complexity. Such bulky encoders are not suitable for applications like wireless low-power surveillance, multimedia sensor networks, wireless PC cameras, and mobile camera phones. A new video coding scheme based on the principle of distributed source coding is examined in this paper. This scheme supports a low-complexity encoder while trying to achieve the rate-distortion performance of conventional video codecs. The current implementation uses LDPC codes for syndrome coding.
In order to decrease both computational complexity and coding time, an improved algorithm for the early detection of all-zero blocks (AZBs) in H.264/AVC is proposed. The previous AZB detection algorithms are reviewed. Three types of transformed frequency-domain coefficients, which are quantized to zeros, are analyzed. Based on the three types of frequency-domain scaling factors, the corresponding spatial coefficients are derived. Then the Schwarz inequality is applied to the derivation of the three thresholds based on spatial coefficients. Another threshold is set on the basis of the probability distribution of zero coefficients in a block. As a result, an adaptive AZB detection algorithm is proposed based on the minimum of the former three thresholds and the threshold of zero blocks distribution. The simulation results show that, compared with the existing AZB detection algorithms, the proposed algorithm achieves a 5% higher AZB detection ratio and 4% to 10% computation saving with only 0.1 dB video quality degradation.
In order to achieve better perceptual coding quality while using fewer bits, a novel perceptual video coding method based on the just-noticeable-distortion (JND) model and the auto-regressive (AR) model is explored. First, a new texture segmentation method exploiting the JND profile is devised to detect and classify texture regions in video scenes. In this step, a spatial-temporal JND model is proposed and the JND energy of every micro-block unit is computed and compared with the threshold. Secondly, in order to effectively remove temporal redundancies while preserving high visual quality, an AR model is applied to synthesize the texture regions. All the parameters of the AR model are obtained by the least-squares method and each pixel in the texture region is generated as a linear combination of pixels taken from the closest forward and backward reference frames. Finally, the proposed method is compared with the H.264/AVC video coding system to demonstrate the performance. Various sequences with different types of texture regions are used in the experiment and the results show that the proposed method can reduce the bit-rate by 15% to 58% while maintaining good perceptual quality.
We are interested in providing Video-on-Demand (VoD) streaming service to a large population of clients using a peer-to-peer (P2P) approach. Given the asynchronous demands from multiple clients, the continuously changing buffered contents, and the continuous video display requirement, how to collaborate with potential partners to get the expected data for future content delivery is very important and challenging. In this paper, we develop a novel scheduling algorithm based on deadline-aware network coding (DNC) to fully exploit the network resource for efficient VoD service. DNC generalizes the existing network coding (NC) paradigm, an elegant solution for ubiquitous data distribution. With deadline awareness, DNC improves the network throughput while avoiding, with high probability, missed playback deadlines, which are a major deficiency of conventional NC. Extensive simulation results demonstrate that DNC achieves high streaming continuity even in tight network conditions.
The scalable extension of H.264/AVC, known as scalable video coding or SVC, is currently the main focus of the Joint Video Team's work. In its present working draft, the higher level syntax of SVC follows the design principles of H.264/AVC. Self-contained network abstraction layer units (NAL units) form natural entities for packetization. The SVC specification is by no means finalized yet, but the work towards an optimized RTP payload format has nevertheless already started. RFC 3984, the RTP payload specification for H.264/AVC, has been taken as a starting point, but it quickly became clear that the scalable features of SVC require adaptation in at least the areas of capability/operation point signaling and documentation of the extended NAL unit header. This paper first gives an overview of the history of scalable video coding, and then reviews the video coding layer (VCL) and NAL of the latest SVC draft specification. Finally, it discusses different aspects of the draft SVC RTP payload format, including the design criteria, use cases, signaling and payload structure.
The authors propose a novel method for transporting multi-view videos that aims to keep the bandwidth requirements on both end-users and servers as low as possible. The method is based on application layer multicast, where each end point receives only a selected number of views required for rendering video from its current viewpoint at any given time. The set of selected videos changes in real time as the user's viewpoint changes because of head or eye movements. Techniques for reducing the black-outs during fast viewpoint changes were investigated. The performance of the approach was studied through network experiments.
AVS2 is a new generation video coding standard developed by the AVS working group. Compared with the first generation AVS video coding standard, known as AVS1, AVS2 significantly improves coding performance by using many new coding technologies, e.g., adaptive block partition and two-level transform coding. Moreover, for scene video, e.g., surveillance video and conference video, AVS2 provides a background picture modeling scheme to achieve more accurate prediction, which can also make object detection and tracking in surveillance video coding more flexible. Experimental results show that AVS2 is competitive with High Efficiency Video Coding (HEVC) in terms of performance. For scene video in particular, AVS2 can achieve 39% bit rate saving over HEVC.
Most recently, due to the demand for immersive communication, region-of-interest (ROI) based high efficiency video coding (HEVC) approaches in conferencing scenarios have become increasingly important. However, there exists no objective metric specially developed for efficiently evaluating the perceived visual quality of video conferencing coding. Therefore, this paper proposes a novel objective quality assessment method, namely Gaussian mixture model based peak signal-to-noise ratio (GMM-PSNR), for perceptual video conferencing coding. First, eye tracking experiments, together with a real-time technique of face and facial feature extraction, are introduced. In the experiments, the importance of background, face, and facial feature regions is identified, and it is then quantified based on eye fixation points over test videos. Next, assuming that the distribution of the eye fixation points obeys a Gaussian mixture model, we utilize the expectation-maximization (EM) algorithm to generate an importance weight map for each frame of video conferencing coding, in light of a new term, eye fixation points/pixel (efp/p). According to the generated weight map, GMM-PSNR is developed for quality assessment by assigning different weights to the distortion of each pixel in the video frame. Finally, experiments are conducted to investigate the correlation of the proposed GMM-PSNR and other conventional objective metrics with subjective quality metrics. The experimental results show the effectiveness of GMM-PSNR.
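The idea of weighting each pixel's distortion by an importance map can be sketched as follows. This is only an illustration of a weighted PSNR: the abstract does not give the exact GMM-PSNR formula, so the normalization by the sum of weights, the function name `weighted_psnr`, and the flat-list frame layout are assumptions, and the weight map is taken as given rather than fitted with EM.

```python
import math


def weighted_psnr(reference, distorted, weights, peak=255.0):
    """PSNR in dB where each pixel's squared error is scaled by an importance weight.

    `reference`, `distorted`, and `weights` are flat sequences of equal length.
    Assumption: the weighted mean squared error is normalized by the weight sum.
    """
    weighted_error = 0.0
    weight_sum = 0.0
    for ref, dist, w in zip(reference, distorted, weights):
        weighted_error += w * (ref - dist) ** 2
        weight_sum += w
    if weight_sum <= 0:
        raise ValueError("weights must sum to a positive value")
    wmse = weighted_error / weight_sum
    if wmse == 0:
        return math.inf  # identical frames under this weighting
    return 10.0 * math.log10(peak ** 2 / wmse)
```

With uniform weights the function reduces to conventional PSNR; raising the weight of a distorted region (for example a face) lowers the score, which is the behavior the abstract describes qualitatively.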
Funding: Project supported by the National Natural Science Foundation of China (No. 60772134) and the Innovation Foundation of Xidian University, China (No. Chuang 05018).
Abstract: A novel color compensation method for multi-view video coding (MVC) is proposed, which efficiently exploits the inter-view dependencies between views in the presence of color mismatch caused by the diversity of cameras. A color compensation model is developed in RGB channels and then extended to YCbCr channels for practical use. A modified inter-view reference picture is constructed based on the color compensation model, which is more similar to the coding picture than the original inter-view reference picture. Moreover, the color compensation factors can be derived in both the encoder and the decoder, so no additional data need to be transmitted to the decoder. The experimental results show that the proposed method improves the coding efficiency of MVC and maintains good subjective quality.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 60832003, 60672052, 60902085, 60972137); the Key Project of Shanghai Municipal Education Commission (Grant No. 09ZZ90); the Natural Science Foundation of Shanghai (Grant No. 09ZR1412500); the Innovation Foundation of Shanghai University (Grant Nos. 10YZ09, SHUCX091061); and the Shuguang Plan of Shanghai Education Development Foundation (Grant No. 06SG43).
Abstract: The current multi-view video coding (MVC) reference model of the Joint Video Team (JVT) does not provide efficient rate control schemes. This paper presents a rate control algorithm for MVC by improving the quadratic rate-distortion (R-D) model. We allocate bit rate among views based on correlation analysis. The proposed algorithm consists of three levels to control the bit rate more accurately, of which the frame layer allocates bits according to the frame complexity and the temporal activity. Extensive experiments show that the proposed algorithm can control the bit rate efficiently.
Funding: Project (08Y29-7) supported by the Transportation Science and Research Program of Jiangsu Province, China; Project (201103051) supported by the Major Infrastructure Program of the Health Monitoring System Hardware Platform Based on Sensor Network Node, China; Project (61100111) supported by the National Natural Science Foundation of China; Project (BE2011169) supported by the Scientific and Technical Supporting Program of Jiangsu Province, China.
Abstract: Variable block-size motion estimation (ME) and disparity estimation (DE) are adopted in multi-view video coding (MVC) to achieve high coding efficiency. However, they also introduce much higher computational complexity into the coding system, which hinders practical application of MVC. An efficient fast mode decision method using mode complexity is proposed to reduce the computational complexity. In the proposed method, mode complexity is first computed by using the spatial, temporal and inter-view correlation between the current macroblock (MB) and its neighboring MBs. Based on the observation that direct mode is highly likely to be the optimal mode, the mode complexity is always checked in advance against a predefined threshold to provide an efficient early termination opportunity. If this early termination condition is not met, the MBs are classified into three mode types according to the value of mode complexity, i.e., simple mode, medium mode and complex mode, to speed up the encoding process by reducing the number of variable block modes that must be checked. Furthermore, for simple and medium mode regions, the rate-distortion (RD) cost of mode 16×16 in the temporal prediction direction is compared with that of the disparity prediction direction, to determine in advance whether the optimal prediction direction is the temporal prediction direction, so that unnecessary disparity estimation can be skipped. Experimental results show that the proposed method reduces the computational load by 78.79% and the total bit rate by 0.07% on average, while incurring only a negligible loss of PSNR (about 0.04 dB on average), compared with the full mode decision (FMD) in the reference software of MVC.
Funding: Supported by ZTE Industry-University-Institute Cooperation Funds.
Abstract: To improve the performance of video compression for machine vision analysis tasks, a video coding for machines (VCM) standard working group was established to promote standardization procedures. In this paper, recent advances in video coding for machines standards are presented, and comprehensive introductions to the use cases, requirements, evaluation frameworks and corresponding metrics of the VCM standard are given. Then the existing methods are presented, introducing the existing proposals by category and the research progress of the latest VCM conference. Finally, we give conclusions.
Abstract: This paper proposes an adaptive hybrid forward error correction (AH-FEC) coding scheme for coping with dynamic packet loss events in video and audio transmission. Specifically, the proposed scheme consists of a hybrid Reed-Solomon and low-density parity-check (RS-LDPC) coding system, combined with a Kalman filter-based adaptive algorithm. The hybrid RS-LDPC coding accommodates a wide range of code length requirements, employing RS coding for short codes and LDPC coding for medium-long codes. We delimit the short and medium-length codes by coding performance so that both codes remain in the optimal region. Additionally, a Kalman filter-based adaptive algorithm has been developed to handle dynamic changes in the packet loss rate. The Kalman filter estimates the packet loss rate utilizing observation data and system models, and then we establish the redundancy decision module through receiver feedback. As a result, the lost packets can be perfectly recovered by the receiver based on the redundant packets. Experimental results show that the proposed method enhances the decoding performance significantly under the same redundancy and channel packet loss.
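A minimal sketch of how a Kalman filter could track the packet loss rate from receiver feedback is given below. The abstract does not specify the system model, so a scalar random-walk state model is assumed; the class name `LossRateKalman`, the noise variances, and the `redundant_packets` rule (send enough packets that the expected number received covers the source block) are hypothetical and are not the paper's redundancy decision module.

```python
import math


class LossRateKalman:
    """Scalar Kalman filter, assuming the loss rate follows a random walk."""

    def __init__(self, initial=0.0, variance=1.0, process_var=1e-4, measurement_var=1e-2):
        self.estimate = initial
        self.variance = variance
        self.process_var = process_var          # assumed drift of the true loss rate
        self.measurement_var = measurement_var  # assumed noise in the feedback reports

    def update(self, observed_loss):
        # Predict: the state is unchanged, its uncertainty grows.
        predicted_var = self.variance + self.process_var
        # Correct: blend prediction and observation using the Kalman gain.
        gain = predicted_var / (predicted_var + self.measurement_var)
        self.estimate += gain * (observed_loss - self.estimate)
        self.estimate = min(max(self.estimate, 0.0), 1.0)
        self.variance = (1.0 - gain) * predicted_var
        return self.estimate


def redundant_packets(source_packets, loss_rate, max_loss=0.9):
    """Hypothetical rule: choose the total so the expected received count covers the source block."""
    loss = min(max(loss_rate, 0.0), max_loss)
    total = math.ceil(source_packets / (1.0 - loss))
    return total - source_packets
```

In use, each feedback report would be passed to `update`, and the returned estimate would drive `redundant_packets` for the next block. A practical scheme would likely add a safety margin on top of the expected-value rule.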
Abstract: Video games have been around for several decades and have advanced considerably since they first appeared. Video games started as virtual games that were advertised towards children, and these virtual games created a virtual reality across a variety of genres. These genres included sports games (such as tennis, football, and baseball), war games, fantasy, puzzles, etc. These games originated in the sports genre, and multiplayer online shooting games are now popular. The purpose of this paper is to investigate the different types of tools available for cheating in the virtual world, which give players an undue advantage over other players in a competition. With the advancement of technology, the development side of video games has expanded. Video game developers have written long lines of code to give video games a new look. As video games have progressed, their coding, bugs, bots, and errors have changed throughout the years. The coding of video games has branched out from the original video games, which has brought many benefits to this virtual world while simultaneously creating more problems, such as bots. The analysis shows that the tools available for cheating in a game disadvantage ordinary gamers in what should be a fair contest.
Funding: Supported partially by the National Natural Science Foundation of China (Grant No. 60503063), the National High-Tech Research & Development Program of China (Grant No. 2006AA01Z321), and the National Basic Research Program of China (Grant No. 2006CB303103).
Abstract: Multi-view video coding (MVC) comprises rich 3D information and is widely used in new visual media, such as 3DTV and free viewpoint TV (FTV). However, even with mainstream computer manufacturers migrating to multi-core processors, the huge computational requirement of MVC currently prohibits its wide use in consumer markets. In this paper, we demonstrate the design and implementation of the first parallel MVC system on the Cell Broadband Engine™ processor, a state-of-the-art multi-core processor. We propose an adaptive, data-driven task-dispatching algorithm that operates at the frame level for MVC, and implement a parallel multi-view video decoder with a modified H.264/AVC codec on a real machine. This approach provides scalable speedup (up to 16 times on sixteen cores) through proper local store management, utilization of code locality, and SIMD improvement. Decoding speed, speedup, and core utilization rate are reported in the experimental results.
Funding: Project (Nos. 60505017 and 60534070) supported by the National Natural Science Foundation of China
Abstract: The rate and distortion of Id-slice do not fit the globally linear relationship on a logarithmic scale. Lagrange multiplier selection methods based on the globally linear approximate relationship are neither efficient nor optimal for multi-view video coding (MVC). To improve the coding efficiency of MVC, a local curve fitting based Lagrange multiplier selection method is proposed in this paper, where Lagrange multipliers are selected according to the local slopes of the approximate curves. Experimental results showed that the proposed method improves the coding efficiency. Up to 2.5 dB gain was achieved at low bitrates.
Funding: Project supported by the National Natural Science Foundation of China (No. 60802013) and the Zhejiang Provincial Natural Science Foundation of China (No. Y106574)
Abstract: New video applications, such as 3D video and free viewpoint video, require efficient compression of multi-view video. In addition to temporal redundancy, exploiting the inter-view redundancy is crucial to improve the performance of multi-view video coding. In this paper, we present a novel method to construct the optimal inter-view prediction structure for multi-view video coding using simulated annealing. In the proposed model, the design of the prediction structure is converted to the arrangement of coding order. Then, a simulated annealing algorithm is employed to minimize the total cost for obtaining the best coding order. This method is applicable to arbitrary irregular camera arrangements. Experimental results show that the annealing process converges rapidly to satisfactory results and that the generated prediction structure outperforms the reference prediction structure of the joint multi-view video model (JMVM) by 0.1-0.8 dB in PSNR.
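The idea of searching for a coding order with simulated annealing can be sketched as follows. This is an illustrative toy, not the paper's cost model: the `intra` and `inter` cost tables, the rule that each view is predicted from its cheapest already-coded view, the swap move, and the geometric cooling schedule are all assumptions.

```python
import math
import random


def order_cost(order, intra, inter):
    """Total cost of a coding order: the first view is coded on its own,
    each later view is predicted from the cheapest earlier view."""
    total = intra[order[0]]
    for pos in range(1, len(order)):
        view = order[pos]
        total += min(inter[ref][view] for ref in order[:pos])
    return total


def anneal_order(intra, inter, t0=10.0, cooling=0.95, steps=2000, seed=0):
    """Search for a low-cost coding order by swapping two views per step."""
    rng = random.Random(seed)
    n = len(intra)
    current = list(range(n))
    cur_cost = order_cost(current, intra, inter)
    best, best_cost = current[:], cur_cost
    temperature = t0
    for _ in range(steps):
        i, j = rng.sample(range(n), 2)
        candidate = current[:]
        candidate[i], candidate[j] = candidate[j], candidate[i]
        cand_cost = order_cost(candidate, intra, inter)
        # Always accept improvements; accept worse orders with a probability
        # that shrinks as the temperature falls.
        accept = cand_cost <= cur_cost or rng.random() < math.exp(
            (cur_cost - cand_cost) / temperature
        )
        if accept:
            current, cur_cost = candidate, cand_cost
            if cur_cost < best_cost:
                best, best_cost = current[:], cur_cost
        temperature = max(temperature * cooling, 1e-9)
    return best, best_cost


if __name__ == "__main__":
    # Hypothetical costs for four views (lower is better).
    intra = [10.0, 12.0, 11.0, 13.0]
    inter = [
        [0.0, 4.0, 7.0, 9.0],
        [4.0, 0.0, 3.0, 6.0],
        [7.0, 3.0, 0.0, 2.0],
        [9.0, 6.0, 2.0, 0.0],
    ]
    print(anneal_order(intra, inter))
```

Because the cost function takes any pairwise table, the same search applies to irregular camera layouts, which matches the applicability the abstract claims.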
Funding: the National Natural Science Foundation of China (No. 60672073, No. 60872094), the Program for New Century Excellent Talents in University (NCET-06-0537), the Key Project of Chinese Ministry of Education (No. 206059), the Scientific Research Fund of Zhejiang Provincial Education Department (No. 20070962), and the Natural Science Foundation of Ningbo (No. 2008A610016).
Abstract: Color inconsistency between views is an important problem to be solved in multi-view video applications, such as free viewpoint television and other three-dimensional video systems. In this paper, a coding-oriented multi-view video color correction method is proposed that is combined with multi-view video coding. We first separate foreground and background in the first Group Of Pictures (GOP) by using the SKIP coding mode. Then, by transferring the means and standard deviations of the backgrounds, color correction is performed for each frame in the GOP, and multi-view video coding is performed and used to renew the backgrounds. Experimental results show that the proposed method can obtain better performance in color correction and multi-view video coding.
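The mean and standard deviation transfer mentioned in this abstract can be illustrated with a small sketch. It assumes a single color channel stored as a flat list of sample values, with statistics taken from background samples only. The function names, the 0-255 clipping range, and the use of population standard deviation are assumptions, not details from the paper.

```python
from statistics import mean, pstdev


def fit_transfer(source_background, reference_background):
    """Return (gain, offset) that map the source statistics onto the
    reference statistics, estimated from background samples only."""
    mu_s, mu_r = mean(source_background), mean(reference_background)
    sd_s, sd_r = pstdev(source_background), pstdev(reference_background)
    gain = sd_r / sd_s if sd_s > 0 else 1.0
    offset = mu_r - gain * mu_s
    return gain, offset


def apply_transfer(channel, gain, offset, lo=0.0, hi=255.0):
    """Correct every sample of a channel and clip it to the valid range."""
    return [min(hi, max(lo, gain * x + offset)) for x in channel]


if __name__ == "__main__":
    src_bg = [10.0, 20.0, 30.0]   # background samples of the view to correct
    ref_bg = [20.0, 40.0, 60.0]   # background samples of the reference view
    gain, offset = fit_transfer(src_bg, ref_bg)
    print(apply_transfer([10.0, 25.0, 30.0], gain, offset))
```

In practice the same fit would be run per channel and per view, with the background mask refreshed as coding proceeds.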
Abstract: Cloud computing has drastically changed the delivery and consumption of live streaming content. The designs, challenges, and possible uses of cloud computing for live streaming are studied. A comprehensive overview of the technical and business issues surrounding cloud-based live streaming is provided, including the benefits of cloud computing, the various live streaming architectures, and the challenges that live streaming service providers face in delivering high-quality, real-time services. The different techniques used to improve the performance of video streaming, such as adaptive bit-rate streaming, multicast distribution, and edge computing, are discussed, and the necessity of low-latency and high-quality video transmission in cloud-based live streaming is underlined. Issues such as improving user experience and live streaming service performance using cutting-edge technology, like artificial intelligence and machine learning, are discussed. In addition, the legal and regulatory implications of cloud-based live streaming, including issues with network neutrality, data privacy, and content moderation, are addressed. The future of cloud computing for live streaming is then examined, with a look at the most likely new developments in trends and technology. For technology vendors, live streaming service providers, and regulators, the findings have major policy-relevant implications. Suggestions on how stakeholders should address these concerns and take advantage of the potential presented by this rapidly evolving sector, as well as insights into the key challenges and opportunities associated with cloud-based live streaming, are provided.
Abstract: Distributed video coding (DVC) is a new video coding approach based on the Wyner-Ziv theorem. The novel uplink-friendly DVC, which offers low-complexity, low-power, and low-cost video encoding, has attracted growing research interest. In this paper, a new method based on multiple view geometry is presented for spatial side information generation in uncalibrated video sensor networks. The trifocal tensor encapsulates all the geometric relations among three views that are independent of scene structure; it can be computed from image correspondences alone without requiring knowledge of the motion or calibration. Simulation results show that trifocal tensor-based spatial side information improves the rate-distortion performance over motion compensation based interpolation side information by up to around 2 dB. Fusion then merges the temporal and spatial side information in order to improve the quality of the final side information. Simulation results show a rate-distortion gain of about 0.4 dB.
Abstract: Popular video coding standards like H.264 and MPEG, which work on the principle of motion-compensated predictive coding, demand substantial computational resources at the encoder, increasing its complexity. Such bulky encoders are not suitable for applications like wireless low-power surveillance, multimedia sensor networks, wireless PC cameras, and mobile camera phones. A new video coding scheme based on the principle of distributed source coding is examined in this paper. This scheme supports a low-complexity encoder while trying to achieve the rate-distortion performance of conventional video codecs. The current implementation uses LDPC codes for syndrome coding.
Funding: The EU Seventh Framework Programme FP7-PEOPLE-IRSES (No. 247083)
Abstract: In order to decrease both computational complexity and coding time, an improved algorithm for the early detection of all-zero blocks (AZBs) in H.264/AVC is proposed. The previous AZB detection algorithms are reviewed. Three types of transformed frequency-domain coefficients, which are quantized to zeros, are analyzed. Based on the three types of frequency-domain scaling factors, the corresponding spatial coefficients are derived. Then the Schwarz inequality is applied to the derivation of the three thresholds based on spatial coefficients. Another threshold is set on the basis of the probability distribution of zero coefficients in a block. As a result, an adaptive AZB detection algorithm is proposed based on the minimum of the former three thresholds and the threshold of the zero-block distribution. The simulation results show that, compared with the existing AZB detection algorithms, the proposed algorithm achieves a 5% higher AZB detection ratio and 4% to 10% computation saving with only 0.1 dB video quality degradation.
Funding: The National Natural Science Foundation of China (No. 60472058, 60975017)
Abstract: In order to achieve better perceptual coding quality while using fewer bits, a novel perceptual video coding method based on the just-noticeable-distortion (JND) model and the auto-regressive (AR) model is explored. First, a new texture segmentation method exploiting the JND profile is devised to detect and classify texture regions in video scenes. In this step, a spatial-temporal JND model is proposed and the JND energy of every micro-block unit is computed and compared with the threshold. Second, in order to effectively remove temporal redundancies while preserving high visual quality, an AR model is applied to synthesize the texture regions. All the parameters of the AR model are obtained by the least-squares method, and each pixel in the texture region is generated as a linear combination of pixels taken from the closest forward and backward reference frames. Finally, the proposed method is compared with the H.264/AVC video coding system to demonstrate its performance. Various sequences with different types of texture regions are used in the experiment, and the results show that the proposed method can reduce the bit-rate by 15% to 58% while maintaining good perceptual quality.
Funding: Project (No. DAG05/06.EG05) supported by the Research Grant Council (RGC) of Hong Kong, China
Abstract: We are interested in providing Video-on-Demand (VoD) streaming service to a large population of clients using a peer-to-peer (P2P) approach. Given the asynchronous demands from multiple clients, the continuously changing buffered contents, and the continuous video display requirement, how to collaborate with potential partners to get the expected data for future content delivery is very important and challenging. In this paper, we develop a novel scheduling algorithm based on deadline-aware network coding (DNC) to fully exploit the network resource for efficient VoD service. DNC generalizes the existing network coding (NC) paradigm, an elegant solution for ubiquitous data distribution. With deadline awareness, DNC improves the network throughput and, with high probability, avoids missing the play deadline, which is a major deficiency of conventional NC. Extensive simulation results demonstrated that DNC achieves high streaming continuity even in tight network conditions.
Abstract: The scalable extension of H.264/AVC, known as scalable video coding or SVC, is currently the main focus of the Joint Video Team's work. In its present working draft, the higher level syntax of SVC follows the design principles of H.264/AVC. Self-contained network abstraction layer units (NAL units) form natural entities for packetization. The SVC specification is by no means finalized yet, but the work towards an optimized RTP payload format has already started. RFC 3984, the RTP payload specification for H.264/AVC, has been taken as a starting point, but it quickly became clear that the scalable features of SVC require adaptation in at least the areas of capability/operation point signaling and documentation of the extended NAL unit header. This paper first gives an overview of the history of scalable video coding, and then reviews the video coding layer (VCL) and NAL of the latest SVC draft specification. Finally, it discusses different aspects of the draft SVC RTP payload format, including the design criteria, use cases, signaling, and payload structure.
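For context on the NAL unit header mentioned in this abstract, here is a small sketch that decodes the one-byte header of a base H.264/AVC NAL unit, the layout that RFC 3984 builds on. It does not cover the extended SVC header discussed in the paper, and the function name and dictionary keys are illustrative choices.

```python
def parse_nal_header(first_byte):
    """Split the one-byte H.264/AVC NAL unit header into its three fields:
    forbidden_zero_bit (1 bit), nal_ref_idc (2 bits), nal_unit_type (5 bits)."""
    if not 0 <= first_byte <= 0xFF:
        raise ValueError("NAL header must be a single byte")
    return {
        "forbidden_zero_bit": (first_byte >> 7) & 0x01,
        "nal_ref_idc": (first_byte >> 5) & 0x03,
        "nal_unit_type": first_byte & 0x1F,
    }


if __name__ == "__main__":
    # 0x67 is a common first byte of a sequence parameter set (type 7).
    print(parse_nal_header(0x67))
    # 0x65 is a common first byte of an IDR slice (type 5).
    print(parse_nal_header(0x65))
```

A packetizer can read these fields without parsing the payload, which is part of why self-contained NAL units suit packetization.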
Funding: Project (No. 511568) supported by the European Commission within Framework Program 6 with the acronym 3DTV
Abstract: The authors propose a novel method for transporting multi-view videos that aims to keep the bandwidth requirements on both end-users and servers as low as possible. The method is based on application layer multicast, where each end point receives only a selected number of views required for rendering video from its current viewpoint at any given time. The set of selected videos changes in real time as the user's viewpoint changes because of head or eye movements. Techniques for reducing the black-outs during fast viewpoint changes were investigated. The performance of the approach was studied through network experiments.
Abstract: AVS2 is a new generation video coding standard developed by the AVS working group. Compared with the first generation AVS video coding standard, known as AVS1, AVS2 significantly improves coding performance by using many new coding technologies, e.g., adaptive block partition and two-level transform coding. Moreover, for scene video, e.g., surveillance video and conference video, AVS2 provides a background picture modeling scheme to achieve more accurate prediction, which can also make object detection and tracking in surveillance video coding more flexible. Experimental results show that AVS2 is competitive with High Efficiency Video Coding (HEVC) in terms of performance. For scene video in particular, AVS2 can achieve a 39% bit rate saving over HEVC.
Abstract: Most recently, due to the demand for immersive communication, region-of-interest (ROI) based high efficiency video coding (HEVC) approaches in conferencing scenarios have become increasingly important. However, there exists no objective metric specially developed for efficiently evaluating the perceived visual quality of video conferencing coding. Therefore, this paper proposes a novel objective quality assessment method, namely Gaussian mixture model based peak signal-to-noise ratio (GMM-PSNR), for perceptual video conferencing coding. First, eye tracking experiments, together with a real-time technique for face and facial feature extraction, are introduced. In the experiments, the importance of background, face, and facial feature regions is identified, and it is then quantified based on eye fixation points over test videos. Next, assuming that the distribution of the eye fixation points obeys a Gaussian mixture model, we utilize the expectation-maximization (EM) algorithm to generate an importance weight map for each frame of video conferencing coding, in light of a new term, eye fixation points/pixel (efp/p). According to the generated weight map, GMM-PSNR is developed for quality assessment by assigning different weights to the distortion of each pixel in the video frame. Finally, we conduct experiments to investigate the correlation of the proposed GMM-PSNR and other conventional objective metrics with subjective quality metrics. The experimental results show the effectiveness of GMM-PSNR.
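The final weighting step described in this abstract can be sketched as a PSNR computed from a weighted mean squared error. This sketch takes the weight map as given and does not reproduce the GMM/EM procedure that generates it. The function name, the flat-list frame representation, and the normalisation by the sum of weights are assumptions.

```python
import math


def weighted_psnr(reference, distorted, weights, peak=255.0):
    """PSNR in which each pixel's squared error is scaled by an importance
    weight, normalised by the total weight (an assumed normalisation)."""
    if not (len(reference) == len(distorted) == len(weights)):
        raise ValueError("inputs must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_mse = sum(
        w * (r - d) ** 2 for r, d, w in zip(reference, distorted, weights)
    ) / total_weight
    if weighted_mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak ** 2 / weighted_mse)


if __name__ == "__main__":
    ref = [100.0, 100.0]
    weights = [3.0, 1.0]  # hypothetical: first pixel lies in a face region
    # The same error costs more when it falls on the heavily weighted pixel.
    print(weighted_psnr(ref, [110.0, 100.0], weights))
    print(weighted_psnr(ref, [100.0, 110.0], weights))
```

With uniform weights the function reduces to ordinary PSNR, which makes it easy to compare against the conventional metric.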