Abstract: The transmission of video content over a network raises various issues relating to copyright authenticity, ethics, legality, and privacy. The protection of copyrighted video content is a significant issue in the video industry, and it is essential to find effective solutions to prevent tampering and modification of digital video content during its transmission through digital media. However, there are still many unresolved challenges. This paper aims to address those challenges by proposing a new technique for detecting moving objects in digital videos, which can help prove the credibility of video content by detecting any fake objects inserted by hackers. The proposed technique uses two methods, H.264 and color feature extraction, to embed and extract watermarks in video frames. The study tested the performance of the system against various attacks and found it to be robust. The evaluation used several metrics: Peak Signal-to-Noise Ratio (PSNR), Mean Squared Error (MSE), Structural Similarity Index Measure (SSIM), Bit Correction Ratio (BCR), and Normalized Correlation. The accuracy of identifying moving objects was high, ranging from 96.3% to 98.7%. The system was also able to embed a fragile watermark with a success rate of over 93.65% and had an average hiding capacity of 78.67. The reconstructed video frames had high quality, with a PSNR of at least 65.45 dB and an SSIM of over 0.97, making the embedded watermark imperceptible to the human eye. The system also had an acceptable average time difference (T=1.227/s) compared with other state-of-the-art methods.
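The MSE and PSNR metrics named in the abstract above have standard definitions, which can be sketched as follows. This is a minimal illustration, not the paper's evaluation code: the function names `mse` and `psnr`, the list-of-rows frame layout, and the 8-bit peak value of 255 are assumptions made for the example.

```python
import math

def mse(ref, test):
    """Mean squared error between two equally sized frames (lists of pixel rows)."""
    total = 0
    count = 0
    for ref_row, test_row in zip(ref, test):
        for r, t in zip(ref_row, test_row):
            total += (r - t) ** 2
            count += 1
    return total / count

def psnr(ref, test, peak=255):
    """PSNR in dB; peak=255 assumes 8-bit samples. Identical frames give infinity."""
    err = mse(ref, test)
    if err == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / err)

# Hypothetical 2x2 frames: one sample differs by 1 after watermark embedding.
original = [[10, 20], [30, 40]]
marked = [[10, 21], [30, 40]]
quality_db = psnr(original, marked)
```

A higher PSNR means the watermarked frame is closer to the original, which is why the abstract cites a high PSNR as evidence of imperceptibility.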
Abstract: The scalable extension of H.264/AVC, known as scalable video coding or SVC, is currently the main focus of the Joint Video Team’s work. In its present working draft, the higher-level syntax of SVC follows the design principles of H.264/AVC. Self-contained network abstraction layer units (NAL units) form natural entities for packetization. The SVC specification is by no means finalized, but work towards an optimized RTP payload format has nevertheless already started. RFC 3984, the RTP payload specification for H.264/AVC, has been taken as a starting point, but it quickly became clear that the scalable features of SVC require adaptation in at least the areas of capability/operation point signaling and documentation of the extended NAL unit header. This paper first gives an overview of the history of scalable video coding, and then reviews the video coding layer (VCL) and NAL of the latest SVC draft specification. Finally, it discusses different aspects of the draft SVC RTP payload format, including the design criteria, use cases, signaling and payload structure.
Abstract: The quantization parameter (QP) is involved in both rate control and rate-distortion optimization (RDO), a dilemma that prevents the use of traditional rate control schemes. Although some rate control schemes have been proposed to circumvent the dilemma, inaccurate prediction models and improper bit allocation hinder the application of H.264 on low-bandwidth channels. To resolve this issue, this paper proposes a novel rate control scheme that considers the variation in macroblock (MB) encoding complexity and in buffer state, and that exploits the spatio-temporal correlation. Simulations showed that this scheme improves the perceptual quality of the pictures with similar or smaller PSNR deviations when compared to the rate control in JVT-O016.
Funding: Project supported by the Research Council of Norway, Norwegian University of Science and Technology (NTNU), and the Norwegian Research Network (UNINETT)
Abstract: In this work, we present an evaluation of the performance and error robustness of RTP-based broadcast streaming of high-quality high-definition (HD) H.264/AVC video. Using a fully controlled IP test bed (Hillestad et al., 2005), we broadcast high-definition video over RTP/UDP, and use an IP network emulator to introduce a varying amount of randomly distributed packet loss. A high-performance network interface monitoring card is used to capture the video packets into a trace file. Purpose-built software parses the trace file, analyzes the RTP stream and assembles the correctly received NAL units into an H.264/AVC Annex B byte stream file, which is subsequently decoded by the JVT JM 10.1 reference software. The proposed measurement setup is a novel, practical and intuitive approach to error resilience testing of real-world H.264/AVC broadcast applications. Through a series of experiments, we evaluate some of the error resilience features of the H.264/AVC standard, and see how they perform at packet loss rates from 0.01% to 5%. The results confirmed that an appropriate slice partitioning scheme is essential for graceful degradation in received quality in the case of packet loss. While flexible macroblock ordering reduces the compression efficiency by about 1 dB for our test material, reconstructed video quality is improved for loss rates above 0.25%.
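The step of assembling correctly received NAL units into an Annex B byte stream, described in the abstract above, can be sketched as follows. In the Annex B format each NAL unit is preceded by a start code (`00 00 01`, optionally with extra leading zero bytes). The function names, the choice of the four-byte start code, and the sample NAL payloads are assumptions for illustration; this is not the authors' purpose-built software, and RTP depacketization is left out.

```python
START_CODE = b"\x00\x00\x00\x01"  # four-byte start code form (assumed choice)

def to_annexb(nal_units):
    """Concatenate NAL units, prefixing each with a start code."""
    return b"".join(START_CODE + nal for nal in nal_units)

def split_annexb(stream):
    """Recover the NAL units from an Annex B byte stream."""
    units = []
    pos = stream.find(b"\x00\x00\x01")
    while pos != -1:
        begin = pos + 3
        nxt = stream.find(b"\x00\x00\x01", begin)
        end = len(stream) if nxt == -1 else nxt
        # Trailing zero bytes belong to the next start code, not to this NAL unit.
        nal = stream[begin:end].rstrip(b"\x00")
        if nal:
            units.append(nal)
        pos = nxt
    return units

# Hypothetical NAL payloads; dropping the middle one mimics a lost packet.
sent = [b"\x67\x42\x00\x1e", b"\x68\xce\x3c\x80", b"\x65\x88\x84"]
received = [sent[0], sent[2]]
byte_stream = to_annexb(received)
```

Because each NAL unit is self-contained, the units that do arrive can still be written out and handed to a decoder, which then has to conceal whatever the missing units carried.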
Funding: The National Natural Science Foundation of China (No. 90104013) and the 863 Project (Nos. 2002AA119010, 2001AA121061 and 2002AA123041)
Abstract: This letter proposes a rate control algorithm for the H.264 video encoder, based on block activity and buffer state. Experimental results indicate that it performs well, providing a much more accurate bit rate and better coding efficiency compared with H.264. The computational complexity of the algorithm is reduced by adopting a novel block activity description method using the Sum of Absolute Difference (SAD) of the 16×16 mode, and its robustness is enhanced by introducing a feedback circuit at the frame layer.
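The Sum of Absolute Difference (SAD) over a 16×16 block, which the abstract above uses as its block activity measure, can be sketched as follows. The letter's exact activity definition is not given here, so this shows only the generic SAD between a current block and a reference (predicted) block; the function name and the list-of-rows frame layout are assumptions.

```python
def sad_16x16(current, reference, top=0, left=0):
    """Sum of absolute differences over the 16x16 block at (top, left)."""
    total = 0
    for y in range(top, top + 16):
        for x in range(left, left + 16):
            total += abs(current[y][x] - reference[y][x])
    return total

# Hypothetical flat blocks: every sample differs by 3, so SAD = 256 * 3.
cur_block = [[10] * 16 for _ in range(16)]
ref_block = [[7] * 16 for _ in range(16)]
activity = sad_16x16(cur_block, ref_block)
```

SAD needs only subtractions and additions, which is consistent with the abstract's point that using it keeps the computational complexity low.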
Funding: This paper is sponsored by the National Natural Science Foundation of China (Nos. 60802057, 61071153), the National 863 Plan of China (2009AA01Z407), the Shanghai Rising-Star Program (10QA1403700), and the Shanghai Educational Development Foundation.
Abstract: A semi-fragile content authentication algorithm is proposed for low bit-rate H.264/AVC video in the VLC domain. Using the intra prediction mode and coded block pattern in the VLC domain, the proposed algorithm chooses the macroblocks from which the signature is extracted and constructs the content signature at macroblock level according to the relationship among the energies of the quantized low-frequency coefficients of sub-macroblocks. The signature is embedded by modifying the trailing coefficients. The experimental results show that the proposed algorithm has little impact on visual quality and keeps the bit rate essentially unchanged. In addition, the algorithm can embed signatures into I, P, and B slices simultaneously and markedly increases the watermark capacity. By verifying the extracted signature, the algorithm can detect and locate video tampering efficiently.
Abstract: The study used a charge-coupled device (CCD) camera to send video signals to four DaVinci™ development boards (TMS320DM6446) from Texas Instruments (TI), which carried out H.264 Baseline Profile video coding. One of the development boards coded in Variable Bit Rate (VBR) mode, and the other three coded in Constant Bit Rate (CBR) mode, at constant rates of 2 Mbps, 1.5 Mbps and 1 Mbps respectively. The H.264 compressed video files produced by the boards were analyzed with video analysis software (CodecVisa). This software can analyze and present the compression data characteristics of the video files for each video frame, i.e., bits/MB, QP, and PSNR. In this research, the data characteristics of each frame under the four compression conditions were compared. Their differences were calculated and averaged, and the standard deviation was evaluated. These were then related to the quality characteristics and the peak signal-to-noise ratio (PSNR) of each frame to analyze the relation among the frame quality, the compression rate of CBR, and the quantization granularity. The preliminary conclusion of the study is that the compression behaviors of CBRs in different coding sources are adjusted in a specific proportion in order to cope with the change in frame complexity. The frame will be severely damaged by a critical value during the process of network transmission while the source rate is less than the value of the characteristic.
Funding: Project supported by the Teaching and Research Award Program for Outstanding Young Professor in High Education Institute, Ministry of Education, China
Abstract: In this paper, we present a spatio-temporal post-processing error concealment (EC) algorithm designed initially for an H.264 video-streaming scheme over packet-lossy networks. It aims at optimizing the subjective quality of the restored video under the constraints of low delay and low computational complexity, which are critical to real-time applications and portable devices with limited resources. Specifically, it takes into consideration the physical properties of the motion field in order to achieve more meaningful perceptual video quality, in addition to improved objective PSNR. Further, a simple bilinear spatial interpolation approach is combined with an improved boundary-match (B-M) based temporal EC approach according to texture and motion activity analysis. Finally, we propose a low-complexity temporal EC method based on motion vector interpolation, either as a replacement for the B-M based approach when computation is constrained, or as a complement that further improves performance in applications with sufficient computational resources. Extensive experiments demonstrated that the proposal features not only better reconstruction, objectively and subjectively, than the JM benchmark, but also robustness to different video sequences.
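The bilinear spatial interpolation mentioned in the abstract above can be sketched as follows. A lost block is filled from the correctly received pixels just outside its four borders, each weighted by the distance to the opposite border so that nearer neighbors count more. This is a generic weighting commonly used for spatial concealment, offered as an assumption; it is not taken from the paper, and the function name and argument layout are hypothetical.

```python
def conceal_block_bilinear(top, bottom, left, right):
    """Fill an n x n lost block from its four neighboring boundary pixel lines.

    top/bottom: pixels directly above/below the block (length n).
    left/right: pixels directly left/right of the block (length n).
    """
    n = len(top)
    block = []
    for i in range(n):          # row index inside the lost block
        row = []
        for j in range(n):      # column index inside the lost block
            d_top, d_bottom = i + 1, n - i
            d_left, d_right = j + 1, n - j
            # Each neighbor is weighted by the distance to the opposite border.
            weighted = (top[j] * d_bottom + bottom[j] * d_top
                        + left[i] * d_right + right[i] * d_left)
            row.append(weighted / (d_top + d_bottom + d_left + d_right))
        block.append(row)
    return block

# Hypothetical 2x2 lost block: dark on three sides, bright on the right.
filled = conceal_block_bilinear([0, 0], [0, 0], [0, 0], [60, 60])
```

Spatial interpolation of this kind tends to blur texture, which is one reason such schemes switch between spatial and temporal concealment based on texture and motion activity.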
Funding: Project supported by the National Natural Science Foundation of China (Nos. 60473106 and 60333010), China Ministry of Education (No. 20030335064), and China Ministry of Science and Technology (No. 2003AA4Z1020)
Abstract: The emergence of third generation (3G) mobile systems makes video transmission in wireless environments possible, and the latest 3GPP/3GPP2 standards require 3G terminals to support H.264/AVC. Due to the high packet loss rate in wireless environments, error resilience for 3G terminals is necessary. Moreover, because of hardware restrictions, 3G mobile terminals support only some of the H.264/AVC error resilience tools. This paper analyzes various error resilience tools and their functions, and presents two error resilience strategies, one for 3G mobile streaming video services and one for mobile conversational services. The performance of the proposed error resilience strategies was tested using off-line common test conditions. Experiments showed that the proposed error resilience strategies yield reasonably satisfactory results.
Funding: This work was supported by the National Natural Science Foundation of China (NSFC) under grant No. 61972269, the Fundamental Research Funds for the Central Universities under grant No. YJ201881, and the Doctoral Innovation Fund Program of Southwest Jiaotong University under grant No. DCX201824.
Abstract: This paper presents a reversible data hiding (RDH) method, which is designed by combining histogram modification (HM) with run-level coding in H.264/advanced video coding (AVC). In this scheme, the run-level is changed to embed data into H.264/AVC video sequences. To guarantee the reversibility of the proposed scheme, the last nonzero quantized discrete cosine transform (DCT) coefficients in embeddable 4×4 blocks are shifted by histogram modification. The proposed scheme is applied after quantization and before entropy coding in the H.264/AVC compression standard. Therefore, the embedded information can be correctly extracted at the decoding side. Peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), embedding payload and bit-rate variation are used to measure the performance of the proposed scheme. Experimental results have shown that the proposed scheme leads to less SSIM variation and a smaller bit-rate increase.
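The reversibility that histogram modification provides, as described in the abstract above, can be illustrated with the classic histogram-shifting idea applied to a list of integer coefficients. This sketch is a generic textbook variant and not the paper's run-level scheme: it assumes coefficients of magnitude 1 carry one bit each, larger magnitudes are shifted outward by one to make room, and zeros are untouched. The function names are hypothetical.

```python
def hm_embed(coeffs, bits):
    """Embed bits into coefficients equal to +1 or -1; shift larger magnitudes by 1."""
    payload = list(bits)
    marked = []
    for c in coeffs:
        if c == 0:
            marked.append(0)
        elif abs(c) == 1:
            b = payload.pop(0) if payload else 0  # pad with 0 once the payload runs out
            marked.append(c + b if c > 0 else c - b)
        else:
            marked.append(c + 1 if c > 0 else c - 1)
    return marked

def hm_extract(marked):
    """Return (bits, restored coefficients); bits include any zero padding."""
    bits, restored = [], []
    for c in marked:
        if c == 0:
            restored.append(0)
        elif abs(c) <= 2:
            bits.append(abs(c) - 1)          # magnitude 1 -> bit 0, magnitude 2 -> bit 1
            restored.append(1 if c > 0 else -1)
        else:
            restored.append(c - 1 if c > 0 else c + 1)
    return bits, restored

# Hypothetical quantized coefficients with three carriers of magnitude 1.
cover = [0, 1, -1, 3, 2, -4, 1]
stego = hm_embed(cover, [1, 0, 1])
recovered_bits, recovered_cover = hm_extract(stego)
```

Because every shift is by exactly one and is undone the same way, the decoder recovers both the payload and the original coefficients, which is the property that makes the hiding reversible.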
Abstract: In this paper, we propose a new method for very low bit-rate video coding that combines the H.264/AVC standard with the two-dimensional discrete wavelet transform. In this method, a two-dimensional wavelet transform is first applied to each video frame independently to extract its low-frequency components, and the low-frequency parts of all frames are then coded using the H.264/AVC codec. The high-frequency parts of the video frames are coded by a Run Length Coding algorithm, after a threshold is applied to discard low-value coefficients. Experiments show that our proposed method can achieve better rate-distortion performance in very low bit-rate applications, below 16 kbits/s, compared to applying the H.264/AVC standard directly to all frames. Applications of our proposed video coding technique include video telephony, video conferencing, and transmitting or receiving video over half-rate traffic channels of GSM networks.
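The high-frequency path described in the abstract above, thresholding followed by run-length coding, can be sketched as follows. The threshold value, the (value, run length) pair format, and the function names are assumptions for illustration; the paper's actual coefficient scan order and code format are not given here.

```python
def threshold(coeffs, limit):
    """Zero out coefficients whose magnitude is below the limit."""
    return [c if abs(c) >= limit else 0 for c in coeffs]

def run_length_encode(values):
    """Encode a sequence as (value, run length) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [(v, n) for v, n in runs]

def run_length_decode(runs):
    """Expand (value, run length) pairs back into a sequence."""
    out = []
    for v, n in runs:
        out.extend([v] * n)
    return out

# Hypothetical high-frequency wavelet coefficients from one frame.
detail = [5, 1, -1, 0, 7, 7, 1]
kept = threshold(detail, 2)
encoded = run_length_encode(kept)
```

Thresholding creates long runs of zeros in the high-frequency bands, and those runs are what makes run-length coding compact; the discarded small coefficients are lost, so this path is lossy.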
Abstract: This article presents a spatio-temporal post-processing error concealment algorithm designed initially for an H.264 video-streaming scheme over packet-lossy networks. It aims at optimizing both the subjective quality of the restored video and the conventional objective metric, peak signal-to-noise ratio (PSNR), under the constraints of low delay and low computational complexity, which are critical to real-time applications and portable devices with limited resources. Specifically, it takes into consideration the physical properties of motion to achieve more meaningful perceptual video quality. Further, a content-adaptive bilinear spatial interpolation approach and a temporal error concealment approach are combined under a unified boundary match criterion based on texture and motion activity analysis. Extensive experiments have demonstrated that the proposal not only results in better reconstruction, objectively and subjectively, than the reference software model benchmark, but is also more robust across different video sequences.