High-resolution video transmission requires a substantial amount of bandwidth. In this paper, we present a novel video processing methodology that integrates region of interest (ROI) identification and super-resolution enhancement. Our method commences with the accurate detection of ROIs within video sequences, followed by the application of advanced super-resolution techniques to these areas, thereby preserving visual quality while economizing on data transmission. To validate and benchmark our approach, we have curated a new gaming dataset tailored to evaluate the effectiveness of ROI-based super-resolution in practical applications. The proposed model architecture leverages the transformer network framework, guided by a carefully designed multi-task loss function, which facilitates concurrent learning and execution of both ROI identification and resolution enhancement tasks. This unified deep learning model exhibits remarkable performance in achieving super-resolution on our custom dataset. The implications of this research extend to optimizing low-bitrate video streaming scenarios. By selectively enhancing the resolution of critical regions in videos, our solution enables high-quality video delivery under constrained bandwidth conditions. Empirical results demonstrate a 15% reduction in transmission bandwidth compared to traditional super-resolution based compression methods, without any perceivable decline in visual quality. This work thus contributes to the advancement of video compression and enhancement technologies, offering an effective strategy for improving digital media delivery efficiency and user experience, especially in bandwidth-limited environments. The integration of ROI identification and super-resolution presents promising avenues for future research and development in adaptive and intelligent video communication systems.
To improve the performance of video compression for machine vision analysis tasks, a video coding for machines (VCM) standard working group was established to promote standardization procedures. In this paper, recent advances in video coding for machines standards are presented, and comprehensive introductions to the use cases, requirements, evaluation frameworks and corresponding metrics of the VCM standard are given. Then the existing methods are presented, introducing the existing proposals by category and the research progress of the latest VCM conference. Finally, we give conclusions.
Many important developments in video compression technologies have occurred during the past two decades. The block-based discrete cosine transform with motion compensation hybrid coding scheme has been widely employed by most available video coding standards, notably the ITU-T H.26x and ISO/IEC MPEG-x families and the video part of the China audio video coding standard (AVS). The objective of this paper is to provide a review of the developments of the four basic building blocks of the hybrid coding scheme, namely predictive coding, transform coding, quantization and entropy coding, and to give theoretical analyses and summaries of the technological advancements. We further analyze the development trends and perspectives of video compression, highlighting problems and research directions.
The evolution of social network and multimedia technologies encourages more and more people to generate and upload visual information, which leads to the generation of large-scale video data. Therefore, preeminent compression technologies are highly desired to facilitate the storage and transmission of this tremendous amount of video data for a wide variety of applications. In this paper, a systematic review of the recent advances in large-scale video compression (LSVC) is presented. Specifically, fast video coding algorithms and effective models to improve video compression efficiency are introduced in detail, since coding complexity and compression efficiency are two important factors in evaluating video coding approaches. Finally, the challenges and future research trends for LSVC are discussed.
In this paper, we summarize 3D perception-oriented algorithms for perceptually driven 3D video coding. Several perceptual effects have been exploited for 2D video viewing; however, this is not yet the case for 3D video viewing. 3D video requires depth perception, which implies binocular effects such as conflicts, fusion, and rivalry. A better understanding of these effects is necessary for 3D perceptual compression, which provides users with a more comfortable visual experience for video that is delivered over a channel with limited bandwidth. We present state-of-the-art 3D visual attention models, 3D just-noticeable difference models, and 3D texture-synthesis models that address 3D human vision issues in 3D video coding and transmission.
Video compression in medical video streaming is one of the key technologies associated with mobile healthcare. Seamless delivery of medical video streams over a resource-constrained network emphasizes the need for a video codec that requires minimum bitrates and maintains high perceptual quality. This paper presents a comparative study between High Efficiency Video Coding (HEVC) and its potential successor Versatile Video Coding (VVC) in the context of healthcare. A large-scale subjective experiment comprising twenty-four non-expert participants is presented for eight different test conditions in Full High Definition (FHD) videos. The presented analysis highlights the impact of compression artefacts on the perceptual quality of HEVC and VVC processed videos. Our results and findings show that VVC clearly outperforms HEVC in terms of achieving higher compression, while maintaining high quality in FHD videos. VVC requires up to 40% less bitrate for encoding an FHD video at excellent perceptual quality. We have provided rate-quality curves for both encoders and a degree of overlap across both codecs in terms of perceptual quality. Overall, there is a 71% degree of overlap in terms of quality between VVC and HEVC compressed videos for eight different test conditions.
Two video coding schemes based on the wavelet transform and achieving very low bit rates are presented in this paper. The first is a hybrid motion-compensated wavelet transform (MC-WT) system, which behaves better at very low bit rates than the block DCT residual coder. The second is a new efficient coding system based on a simple frame-differencing wavelet transform (FD-WT), which performs well in both PSNR and visual quality with substantially reduced complexity.
A new, improved version of Goh's 3-D wavelet transform (WT) coding scheme is presented in this paper. The new scheme has great advantages, including a simple code structure, low computation cost and good performance in PSNR, compression ratio and visual quality of reconstructions, when compared to the other existing 3-D WT coding methods and the 2-D WT based coding methods. The new 3-D WT coding scheme is suitable for very low bit rate video coding.
The high-efficiency video coder (HEVC) is one of the most advanced techniques used in growing real-time multimedia applications today. However, these applications require large bandwidth for transmission, and the required bandwidth varies with different video sequences/formats. This paper proposes an adaptive information-based variable quantization matrix (AIVQM) developed for different video formats having variable energy levels. The quantization method is adapted to the video sequence using statistical analysis, improving bit budget and quality and reducing complexity. Further, to have precise control over bit rate and quality, a multi-constraint prune algorithm is proposed in the second stage of the AIVQM technique for pre-calculating K paths. This makes it possible to self-adapt and choose one of the K paths automatically, as required, under dynamically changing bandwidth availability. After extensive testing of the proposed algorithm in the multi-constraint environment for multiple paths, and evaluation of the performance based on peak signal-to-noise ratio (PSNR), bit budget and time complexity for different videos, a noticeable improvement in rate-distortion (RD) performance is achieved. Using the proposed AIVQM technique, more feasible and efficient video sequences are achieved with less loss in PSNR than the variable quantization method (VQM) algorithm, with a rise of approximately 10%–20% depending on the video sequence/format.
High Efficiency Video Coding (HEVC) is the latest international video coding standard, which can provide similar quality with about half the bandwidth compared with its predecessor, H.264/MPEG-4 AVC. To meet the requirement of higher bit depth coding and more chroma sampling formats, range extensions of HEVC were developed. This paper introduces the coding tools in HEVC range extensions and provides experimental results to compare HEVC range extensions with previous video coding standards. Experimental results show that HEVC range extensions substantially improve coding efficiency over the H.264/MPEG-4 AVC High Predictive profile, especially for 4K sequences.
Discrete Cosine Transform (DCT) is the most widely used technique in image and video compression. In this paper, the structure of the DCT and Inverse DCT (IDCT) algorithm is split in the form of a COordinate Rotation DIgital Computer (CORDIC) rotation matrix. Two-dimensional (2-D) 8×8 DCT/IDCT units based on the improved rotation CORDIC algorithm are proposed. The shift and addition operations of the CORDIC algorithm are used to replace the cosine multiplication operations in the algorithm. The design does not contain any multiplier unit, which reduces the complexity of the hardware unit. The row-column transform unit, composed of register arrays, connects two 1-D 8-point DCT units to complete the calculation of the 2-D 8×8 DCT. The pipeline latency of the proposed architecture is 28 clock cycles. The proposed efficient two-dimensional DCT architecture has been synthesized on Xilinx's Kintex-7 FPGA. The resource utilization is 17.36% for Slice LUTs and 3.49% for Slice Registers, and the maximum operating frequency is 172 MHz. It takes only 0.161 μs to process a block of 8×8 samples. A frame of an image is processed by the designed DCT unit and then reconstructed by the IDCT unit to verify the function. The Peak Signal to Noise Ratio (PSNR) can reach 51.99 dB.
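The row-column decomposition and the DCT-then-IDCT verification mentioned in this abstract can be illustrated with a minimal software sketch. It is not the paper's CORDIC hardware design: it uses ordinary floating-point cosine multiplications instead of shift-and-add rotations, and the function names (`dct_1d`, `dct_2d`, `psnr`) and the orthonormal scaling are assumptions made for illustration only.

```python
import math

N = 8  # block size named in the abstract (8x8); everything else is illustrative


def _scale(k):
    # Orthonormal DCT-II scaling (an assumption; hardware designs often scale differently).
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


def dct_1d(x):
    """1-D 8-point DCT-II computed with plain cosine multiplications."""
    return [_scale(k) * sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                            for n in range(N))
            for k in range(N)]


def idct_1d(c):
    """Inverse of dct_1d (DCT-III with the same orthonormal scaling)."""
    return [sum(_scale(k) * c[k] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                for k in range(N))
            for n in range(N)]


def _transpose(block):
    return [list(col) for col in zip(*block)]


def _row_column(block, transform):
    # Row pass, transpose, column pass, transpose back: two 1-D passes give the 2-D result.
    rows = [transform(r) for r in block]
    cols = [transform(c) for c in _transpose(rows)]
    return _transpose(cols)


def dct_2d(block):
    return _row_column(block, dct_1d)


def idct_2d(coeffs):
    return _row_column(coeffs, idct_1d)


def psnr(a, b, peak=255.0):
    """PSNR in dB between two equally sized blocks; infinite when they are identical."""
    diffs = [(p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb)]
    mse = sum(diffs) / len(diffs)
    return math.inf if mse == 0 else 10.0 * math.log10(peak * peak / mse)
```

Running a block through `dct_2d` and then `idct_2d` and comparing it with the input via `psnr` mirrors, in software, the functional check the abstract describes. A fixed-point or CORDIC implementation would show a finite PSNR because of its approximation error.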
Super-Resolution (SR) refers to reconstructing High-Resolution (HR) images from a sequence of Low-Resolution (LR) observations, and has been a major focus for compressed video. Based on the theory of Projection Onto Convex Sets (POCS), this paper constructs a Quantization Constraint Set (QCS) using the quantization information extracted from the video bit stream. By combining the statistical properties of the image and the Human Visual System (HVS), a novel Adaptive Quantization Constraint Set (AQCS) is proposed. Simulation results show that the AQCS-based SR algorithm converges at a fast rate and obtains better performance in both objective and subjective quality, and is applicable to compressed video.
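The idea of a quantization constraint set can be sketched as follows. The sketch assumes a uniform quantizer with rounding and no dead zone, so that a decoded level `q` with step size `step` implies the original coefficient lay in the interval [(q - 0.5)·step, (q + 0.5)·step]. Projecting onto the set then amounts to clipping each coefficient into its interval. The function name and the quantizer model are illustrative assumptions; the adaptive narrowing of the set (AQCS) is not specified in the abstract and is not modeled here.

```python
def project_onto_qcs(coeffs, levels, step):
    """Clip each transform coefficient into the interval implied by its quantized level.

    coeffs: current estimates of the transform coefficients
    levels: integer quantization levels read from the bit stream
    step:   quantizer step size (uniform quantizer with rounding is assumed)
    """
    projected = []
    for c, q in zip(coeffs, levels):
        lo = (q - 0.5) * step   # lower edge of the quantization bin
        hi = (q + 0.5) * step   # upper edge of the quantization bin
        projected.append(min(max(c, lo), hi))
    return projected
```

Because the set is a product of intervals, it is convex and the projection is idempotent, which is what allows it to be alternated with other constraint projections in a POCS iteration.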
This letter proposes a novel method of compressed video super-resolution reconstruction based on MAP-POCS (Maximum Posterior Probability-Projection Onto Convex Set). First, the high-resolution model is assumed to follow a Poisson-Markov distribution, and the projection convex sets are then constructed based on MAP. According to the characteristics of compressed video, two different convex sets are constructed by integrating the inter-frame and intra-frame information in the wavelet domain. The experimental results demonstrate that the new method not only outperforms the traditional algorithms in terms of PSNR (Peak Signal-to-Noise Ratio), MSE (Mean Square Error) and visual quality of the reconstruction, but also has the advantages of rapid convergence and easy extension.
Extraction of traffic information from image or video sequences is a hot research topic in intelligent transportation systems and computer vision. A real-time traffic information extraction method based on compressed video with interframe motion vectors for speed, density and flow detection has been proposed for extraction of traffic information under a fixed camera setting and a well-defined environment. The motion vectors are first separated from the compressed video streams, and then filtered to eliminate incorrect and noisy vectors using the well-defined environmental knowledge. By applying the projective transform and using the filtered motion vectors, speed can be calculated from motion vector statistics, density can be estimated using the motion vector occupancy, and flow can be detected using the combination of speed and density. A prototype system for sky camera traffic monitoring using MPEG video has been implemented, and experimental results proved the effectiveness of the proposed method.
This paper proposes a thorough scheme, by virtue of a camera zooming descriptor with a two-level threshold, to automatically retrieve close-ups directly from moving picture experts group (MPEG) compressed videos based on camera motion analysis. A new algorithm for fast camera motion estimation in the compressed domain is presented. In the retrieval process, camera-motion-based semantic retrieval is built. To improve the coverage of the proposed scheme, close-up retrieval in all kinds of videos is investigated. Extensive experiments illustrate that the proposed scheme provides promising retrieval results under real-time and automatic application scenarios.
Video reconstruction quality largely depends on the ability of the employed sparse domain to adequately represent the underlying video in Distributed Compressed Video Sensing (DCVS). In this paper, we propose a novel dynamic global-Principal Component Analysis (PCA) sparse representation algorithm for video based on the sparse-land model and nonlocal similarity. First, grouping by matching is realized at the decoder from key frames that are previously recovered. Second, we apply PCA to each group (sub-dataset) to compute the principal components from which the sub-dictionary is constructed. Finally, the non-key frames are reconstructed from random measurement data using a Compressed Sensing (CS) reconstruction algorithm with sparse regularization. Experimental results show that our algorithm has a better performance compared with the DCT and K-SVD dictionaries.
In this paper, a video compressed sensing reconstruction algorithm based on multidimensional reference frames is proposed using the sparse characteristics of video signals in different sparse representation domains. First, the overall structure of the proposed video compressed sensing algorithm is introduced. The paper adopts a multi-reference frame bidirectional prediction hypothesis optimization algorithm. Then, the paper proposes a reconstruction method for CS frames at the re-decoding end. In addition to using key frames of each GOP reconstructed in the time domain as reference frames for reconstructing CS frames, half-pixel reference frames and scaled reference frames in the pixel domain are also used as reference frames for CS frames, in order to obtain higher-quality hypotheses. The method of obtaining reference frames in the pixel domain is also discussed in detail. Finally, the proposed reconstruction algorithm is compared with video compressed sensing algorithms in the literature that have better reconstruction results. Experiments show that the algorithm has better performance than the best multi-reference frame video compressed sensing algorithm and can effectively improve the quality of slow-motion video reconstruction.
Studies show that encoding technologies in H.264/AVC, including prediction and conversion, are essential technologies. However, these technologies are more complicated than those of MPEG-4, which is a standard method and widely adopted worldwide. Therefore, the amount of calculation in H.264/AVC is significantly higher than that of MPEG-4. The present study aims to reduce the computational expense of the international standard compression coding system H.264/AVC for moving images. Inter prediction is the most effective compression technology, taking up to 60% of the entire encoding. In this regard, prediction error and motion vector information are used to simplify the computation of inter predictive coding. In the initial frame, motion compensation is performed in all target modes, and then basic information is collected and analyzed. After the initial frame, motion compensation is performed only in the middle 8×8 modes, and the basic information amount shifts. In order to evaluate the effectiveness of the proposed method and assess the motion image compression coding, four types of motion images, defined by the International Telecommunication Union (ITU), are employed. Based on the obtained results, it is concluded that the developed method is capable of simplifying the calculation, with only a slight effect on image quality and the amount of information.
A new, faster block-matching algorithm (BMA) using both search-candidate and pixel subsampling is proposed. First, the pixel-subsampling approach used in adjustable partial distortion search (APDS) is adjusted to visit about half of all search candidates by subsampling them, using a spiral-scanning path with one skip. The two selected candidates that have the minimal and second-minimal block distortion measures are obtained. Then a fine-tuning step is taken around them to find the best one. Some analyses are given to justify the approach of this paper. Experimental results show that, compared to APDS, the proposed algorithm can enhance the block-matching speed by about 30% while maintaining an MSE performance very close to that of APDS. It also performs much better than many other BMAs such as TSS, NTSS, UCDBS and NPDS.
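The three steps named in this abstract (a coarse pass over about half of the candidates, keeping the two best, then fine-tuning around them) can be sketched as below. This is an illustrative approximation, not the paper's algorithm: the ring-by-ring candidate order stands in for the spiral scan, the checkerboard pixel pattern stands in for the APDS subsampling pattern, and all function names are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, n, subsample=True):
    """Sum of absolute differences for an n x n block displaced by (dx, dy).

    With subsample=True only a checkerboard half of the pixels is used
    (an assumed stand-in for the paper's pixel-subsampling pattern).
    """
    total = 0
    for y in range(n):
        for x in range(n):
            if subsample and (x + y) % 2:
                continue
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def ring_order(radius):
    """Displacements sorted ring by ring outward from (0, 0), approximating a spiral scan."""
    pts = [(dx, dy) for dy in range(-radius, radius + 1)
           for dx in range(-radius, radius + 1)]
    pts.sort(key=lambda p: (max(abs(p[0]), abs(p[1])), p[1], p[0]))
    return pts


def search(cur, ref, bx, by, n, radius):
    """Return the (dx, dy) chosen by the coarse-then-fine search."""
    h, w = len(ref), len(ref[0])

    def valid(dx, dy):
        return (abs(dx) <= radius and abs(dy) <= radius
                and 0 <= bx + dx and bx + dx + n <= w
                and 0 <= by + dy and by + dy + n <= h)

    # Coarse pass: visit every other candidate along the scan, using subsampled SAD.
    coarse = [p for p in ring_order(radius)[::2] if valid(*p)]
    ranked = sorted(coarse, key=lambda p: sad(cur, ref, bx, by, p[0], p[1], n))

    # Fine pass: full-pixel SAD over the 3x3 neighbourhoods of the two best candidates.
    fine = set()
    for dx, dy in ranked[:2]:
        for ddy in (-1, 0, 1):
            for ddx in (-1, 0, 1):
                if valid(dx + ddx, dy + ddy):
                    fine.add((dx + ddx, dy + ddy))
    return min(sorted(fine),
               key=lambda p: sad(cur, ref, bx, by, p[0], p[1], n, subsample=False))
```

The coarse pass costs roughly a quarter of a full search (half the candidates, half the pixels each), and the fine pass adds at most 18 full-pixel evaluations. This is where a speed-up of the kind reported in the abstract would come from, although the sketch makes no claim to reproduce the reported 30% figure.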
It is known from entropy theory that an image is a source correlated with a certain probability characteristic. The entropy rate of the source and the ε-entropy (rate-distortion function theory) are the information content used to identify the characteristics of video images, and hence are essentially related to video image compression. They are fundamental theories of great significance to image compression, though they cannot be directly turned into a compression method. Based on entropy theory and image compression theory, and by applying the rate-distortion mathematical model and Lagrange multipliers to some theoretical problems in the H.264 standard, this paper presents a new algorithm model for rate-distortion coding. The model is fully tested on the JM61e test model (JVT Test Model). The results show that the coding speed increases without significant reduction of the rate-distortion performance of the coder.
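The role of the Lagrange multiplier in H.264 mode decision can be illustrated with a short sketch. It shows the generic Lagrangian cost J = D + λ·R and the multiplier commonly cited for the JM reference software, λ = 0.85 · 2^((QP − 12)/3). It does not reproduce the new model proposed in this paper, whose details the abstract does not give. The mode names and the distortion and rate figures are hypothetical.

```python
def lagrange_multiplier(qp):
    """Mode-decision multiplier commonly cited for the H.264 JM reference software."""
    return 0.85 * 2 ** ((qp - 12) / 3)


def best_mode(candidates, qp):
    """Pick the candidate minimizing J = D + lambda * R.

    candidates: list of dicts with hypothetical keys 'name', 'distortion' (e.g. SSD)
                and 'rate' (bits needed to code the block in that mode).
    """
    lam = lagrange_multiplier(qp)
    return min(candidates, key=lambda m: m["distortion"] + lam * m["rate"])


# Hypothetical candidates for one macroblock.
modes = [
    {"name": "skip", "distortion": 1000, "rate": 1},
    {"name": "inter16x16", "distortion": 400, "rate": 40},
    {"name": "intra4x4", "distortion": 100, "rate": 400},
]
low_qp_choice = best_mode(modes, qp=12)["name"]   # small lambda: distortion dominates
high_qp_choice = best_mode(modes, qp=36)["name"]  # large lambda: rate dominates
```

As QP grows, λ grows exponentially, so the same candidates are ranked increasingly by rate rather than distortion. A faster encoder can reduce the number of candidates evaluated, provided the chosen mode stays close to the minimizer of J.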
Funding (ROI-based super-resolution study): National Key Research and Development Program of China (No. 2022YFC3302103).
Funding (video coding for machines survey): ZTE Industry-University-Institute Cooperation Funds.
Funding (hybrid video coding review): National Basic Research Program (973) of China (Project No. 2009CB320903).
Funding (large-scale video compression review): supported in part by the National Natural Science Foundation of China (Grant Nos. 61622115 and 61472281), the Program for Professor of Special Appointment (Eastern Scholar) at Shanghai Institutions of Higher Learning (GZ2015005), and the Shanghai Engineering Research Center of Industrial Vision Perception & Intelligent Computing (17DZ2251600).
Funding (HEVC versus VVC medical video study): Innovate UK, which is part of UK Research & Innovation, and Pangea Connected Ltd., under the Knowledge Transfer Partnership (KTP) program (Project No. 11433).
文摘Video compression in medical video streaming is one of the key technologies associated with mobile healthcare.Seamless delivery of medical video streams over a resource constrained network emphasizes the need of a video codec that requires minimum bitrates and maintains high perceptual quality.This paper presents a comparative study between High Efciency Video Coding(HEVC)and its potential successor Versatile Video Coding(VVC)in the context of healthcare.A large-scale subjective experiment comprising of twenty-four non-expert participants is presented for eight different test conditions in Full High Denition(FHD)videos.The presented analysis highlights the impact of compression artefacts on the perceptual quality of HEVC and VVC processed videos.Our results and ndings show that VVC clearly outperforms HEVC in terms of achieving higher compression,while maintaining high quality in FHD videos.VVC requires upto 40%less bitrate for encoding an FHD video at excellent perceptual quality.We have provided rate-quality curves for both encoders and a degree of overlap across both codecs in terms of perceptual quality.Overall,there is a 71%degree of overlap in terms of quality between VVC and HEVC compressed videos for eight different test conditions.
文摘Two video coding schemes based on wavelet transform achieving very low bit rate are presented in this paper. The first is a hybrid motion compensated wavelet transform(MC WT)system which behaves better at very low bit rates than the block DCT residual coder. The second is a new efficient coding system based on a simple frame differencing wavelet transform(FD WT)which performs well in both PSNR and visual quality with substantially reduced complexity.
文摘A new improved Goh's 3 D wavelet transform(WT) coding scheme is presented in this paper. The new scheme has great advantages including a simple code structure, low computation cost and good performance in PSNR, compression ratios and visual quality of reconstructions, when compared to the other existing 3 D WT coding methods and the 2 D WT based coding methods. The new 3 D WT coding scheme is suitable for very low bit rate video coding.
文摘The high-efficiency video coder(HEVC)is one of the most advanced techniques used in growing real-time multimedia applications today.However,they require large bandwidth for transmission through bandwidth,and bandwidth varies with different video sequences/formats.This paper proposes an adaptive information-based variable quantization matrix(AIVQM)developed for different video formats having variable energy levels.The quantization method is adapted based on video sequence using statistical analysis,improving bit budget,quality and complexity reduction.Further,to have precise control over bit rate and quality,a multi-constraint prune algorithm is proposed in the second stage of the AI-VQM technique for pre-calculating K numbers of paths.The same should be handy to selfadapt and choose one of the K-path automatically in dynamically changing bandwidth availability as per requirement after extensive testing of the proposed algorithm in the multi-constraint environment for multiple paths and evaluating the performance based on peak signal to noise ratio(PSNR),bit-budget and time complexity for different videos a noticeable improvement in rate-distortion(RD)performance is achieved.Using the proposed AIVQM technique,more feasible and efficient video sequences are achieved with less loss in PSNR than the variable quantization method(VQM)algorithm with approximately a rise of 10%–20%based on different video sequences/formats.
Abstract: High Efficiency Video Coding (HEVC) is the latest international video coding standard. It can provide similar quality at about half the bandwidth of its predecessor, H.264/MPEG-4 AVC. To meet the requirements of higher bit-depth coding and more chroma sampling formats, range extensions of HEVC were developed. This paper introduces the coding tools in the HEVC range extensions and provides experimental results comparing them with previous video coding standards. Experimental results show that the HEVC range extensions considerably improve coding efficiency over the H.264/MPEG-4 AVC High Predictive profile, especially for 4K sequences.
Abstract: The Discrete Cosine Transform (DCT) is the most widely used technique in image and video compression. In this paper, the structure of the DCT and Inverse DCT (IDCT) algorithms is decomposed into COordinate Rotation DIgital Computer (CORDIC) rotation matrices. Two-dimensional (2-D) 8×8 DCT/IDCT units based on an improved rotation CORDIC algorithm are proposed. The shift and addition operations of the CORDIC algorithm replace the cosine multiplications in the transform. The design contains no multiplier unit, which reduces the complexity of the hardware. A row-column transform unit composed of register arrays connects two 1-D 8-point DCT units to complete the 2-D 8×8 DCT. The pipeline latency of the proposed architecture is 28 clock cycles. The proposed 2-D DCT architecture has been synthesized on a Xilinx Kintex-7 FPGA. The resource utilization is 17.36% for Slice LUTs and 3.49% for Slice Registers, and the maximum operating frequency is 172 MHz. It takes only 0.161 μs to process one block of 8×8 samples. To verify the function, a frame of an image is processed by the designed DCT unit and then reconstructed by the IDCT unit. The Peak Signal-to-Noise Ratio (PSNR) reaches 51.99 dB.
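The shift-and-add rotation that the abstract relies on can be illustrated with a minimal software sketch of the generic rotation-mode CORDIC algorithm. This is not the paper's improved variant or its hardware design; the function name `cordic_rotate` and the iteration count are assumptions chosen for illustration, and floating-point multiplications by 2^-i stand in for the right shifts a hardware unit would use.

```python
import math

def cordic_rotate(x, y, angle, iterations=16):
    """Rotate (x, y) by `angle` radians using generic rotation-mode CORDIC.

    Converges for |angle| up to about 1.74 rad (the sum of the elementary angles).
    """
    # Elementary angles atan(2^-i); in hardware these come from a small lookup table.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    # Each micro-rotation stretches the vector by sqrt(1 + 2^-2i); the total gain is constant.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    z = angle  # residual angle still to be rotated
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0
        # Multiplying by 2^-i corresponds to an arithmetic right shift in fixed-point hardware.
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    # Compensate the constant gain once at the end.
    return x / gain, y / gain
```

Calling `cordic_rotate(1.0, 0.0, math.pi / 8)` approximates `(cos(π/8), sin(π/8))`, the kind of cosine factor that appears in an 8-point DCT butterfly.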
Funding: Supported by the Natural Science Foundation of Jiangsu Province (No. BK2004151).
Abstract: Super-Resolution (SR) techniques reconstruct High-Resolution (HR) images from a sequence of Low-Resolution (LR) observations and have received great attention for compressed video. Based on the theory of Projection Onto Convex Sets (POCS), this paper constructs a Quantization Constraint Set (QCS) using the quantization information extracted from the video bit stream. By combining the statistical properties of images with the Human Visual System (HVS), a novel Adaptive Quantization Constraint Set (AQCS) is proposed. Simulation results show that the AQCS-based SR algorithm converges quickly and achieves better performance in both objective and subjective quality, making it applicable to compressed video.
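The projection onto a quantization constraint set can be sketched as follows. This is the basic, non-adaptive QCS only, assuming a uniform quantizer with rounding and no dead zone; the adaptive set (AQCS) of the paper is not reproduced here, and the function name and numeric values are hypothetical.

```python
def project_onto_qcs(coeff, level, step):
    """Clip one DCT coefficient into the interval implied by its quantization level.

    Assumes a uniform quantizer with rounding, so a decoded level q means the
    original coefficient lay in [(q - 0.5) * step, (q + 0.5) * step].
    """
    lower = (level - 0.5) * step
    upper = (level + 0.5) * step
    return min(max(coeff, lower), upper)

# Hypothetical values: SR estimates of three coefficients and their decoded levels.
step = 8.0
levels = [3, 0, -1]
estimates = [30.5, 1.0, -13.0]
projected = [project_onto_qcs(c, q, step) for c, q in zip(estimates, levels)]
```

Only estimates that drifted outside their quantization bin are changed; values already consistent with the bit stream pass through untouched.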
Funding: Supported by the Natural Science Foundation of Jiangsu Province (No. BK2004151).
Abstract: This letter proposes a novel method of compressed video super-resolution reconstruction based on MAP-POCS (Maximum Posterior Probability-Projection Onto Convex Set). First, the high-resolution model is assumed to follow a Poisson-Markov distribution, and the projection convex set is then constructed based on MAP. According to the characteristics of compressed video, two different convex sets are constructed by integrating inter-frame and intra-frame information in the wavelet domain. Experimental results demonstrate that the new method not only outperforms traditional algorithms in PSNR (Peak Signal-to-Noise Ratio), MSE (Mean Square Error), and visual quality of the reconstruction, but also converges rapidly and is easy to extend.
Abstract: Extraction of traffic information from images or video sequences is a hot research topic in intelligent transportation systems and computer vision. A real-time traffic information extraction method based on compressed video, which uses inter-frame motion vectors to detect speed, density, and flow, is proposed for a fixed camera setting and a well-defined environment. The motion vectors are first separated from the compressed video streams and then filtered to eliminate incorrect and noisy vectors using knowledge of the well-defined environment. By applying the projective transform to the filtered motion vectors, speed can be calculated from motion vector statistics, density can be estimated from the motion vector occupancy, and flow can be detected from the combination of speed and density. A prototype system for sky-camera traffic monitoring using MPEG video has been implemented, and experimental results demonstrate the effectiveness of the proposed method.
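A minimal sketch of how speed, density, and flow might be derived from filtered motion vectors is given below. The abstract does not specify the exact statistics, so the mean magnitude, the occupancy ratio, and the product used for flow are assumptions, as are the function name and the parameters `metres_per_pixel` and `fps`; the vectors are assumed to have already been rectified by the projective transform.

```python
import math

def traffic_estimates(vectors, total_blocks, metres_per_pixel, fps):
    """Estimate (speed, density, flow) from filtered motion vectors.

    vectors: (dx, dy) displacements in pixels per frame, after rectification.
    total_blocks: number of macroblocks in the monitored road region.
    """
    if not vectors:
        return 0.0, 0.0, 0.0
    # Speed: mean motion-vector magnitude converted to metres per second.
    mean_mag = sum(math.hypot(dx, dy) for dx, dy in vectors) / len(vectors)
    speed = mean_mag * metres_per_pixel * fps
    # Density: fraction of blocks that carry a valid (moving) vector.
    density = len(vectors) / total_blocks
    # Flow: combination of speed and density, here simply their product.
    flow = speed * density
    return speed, density, flow
```

For example, `traffic_estimates([(3, 4), (6, 8)], 10, 0.1, 25)` averages magnitudes of 5 and 10 pixels per frame and reports an occupancy of two blocks out of ten.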
Funding: This work was supported by the European IST FP6 Research Programme under the Integrated Project LIVE (No. IST-4-027312).
Abstract: This paper proposes a complete scheme that uses a camera zooming descriptor with a two-level threshold to automatically retrieve close-ups directly from Moving Picture Experts Group (MPEG) compressed videos based on camera motion analysis. A new algorithm for fast camera motion estimation in the compressed domain is presented. In the retrieval process, camera-motion-based semantic retrieval is built. To improve the coverage of the proposed scheme, close-up retrieval in all kinds of videos is investigated. Extensive experiments show that the proposed scheme provides promising retrieval results in real-time, automatic application scenarios.
Funding: Supported by the Innovation Project of Graduate Students of Jiangsu Province, China under Grants No. CXZZ12_0466 and No. CXZZ11_0390; the National Natural Science Foundation of China under Grants No. 61071091, No. 61271240, No. 61201160, and No. 61172118; the Natural Science Foundation of the Higher Education Institutions of Jiangsu Province, China under Grant No. 12KJB510019; the Science and Technology Research Program of Hubei Provincial Department of Education under Grants No. D20121408 and No. D20121402; and the Program for Research Innovation of Nanjing Institute of Technology Project under Grant No. CKJ20110006.
Abstract: In Distributed Compressed Video Sensing (DCVS), video reconstruction quality largely depends on how well the employed sparse domain represents the underlying video. In this paper, we propose a novel dynamic global Principal Component Analysis (PCA) sparse representation algorithm for video based on the sparse-land model and nonlocal similarity. First, grouping by matching is performed at the decoder on previously recovered key frames. Second, we apply PCA to each group (sub-dataset) to compute the principal components from which the sub-dictionary is constructed. Finally, the non-key frames are reconstructed from random measurement data using a Compressed Sensing (CS) reconstruction algorithm with sparse regularization. Experimental results show that our algorithm performs better than the DCT and K-SVD dictionaries.
Abstract: In this paper, a video compressed sensing reconstruction algorithm based on multidimensional reference frames is proposed, exploiting the sparsity of video signals in different sparse representation domains. First, the overall structure of the proposed video compressed sensing algorithm is introduced. The paper adopts a multi-reference-frame bidirectional prediction hypothesis optimization algorithm. Then, the paper proposes a reconstruction method for CS frames at the re-decoding end. In addition to the key frames of each GOP reconstructed in the time domain, half-pixel reference frames and scaled reference frames in the pixel domain are also used as reference frames for reconstructing CS frames, in order to obtain higher-quality hypotheses. The method of obtaining reference frames in the pixel domain is also discussed in detail. Finally, the proposed reconstruction algorithm is compared with video compressed sensing algorithms in the literature that have better reconstruction results. Experiments show that the algorithm performs better than the best multi-reference-frame video compressed sensing algorithm and can effectively improve the reconstruction quality of slow-motion video.
Funding: Supported by the QingLan Project of Jiangsu Province, the National Science Fund of China (Nos. 61806088, 61902160), and the Changzhou Science and Technology Support Plan (No. CE20185044).
Abstract: Studies show that the encoding technologies in H.264/AVC, including prediction and conversion, are essential technologies. However, these technologies are more complicated than those of MPEG-4, a standard method that is widely adopted worldwide. Therefore, the amount of calculation in H.264/AVC is significantly higher than in MPEG-4. The present study aims to reduce the computational expense of the international standard compression coding system H.264/AVC for moving images. Inter prediction is the most effective compression technology, accounting for up to 60% of the entire encoding. In this regard, prediction error and motion vector information are used to simplify the computation of inter predictive coding. In the initial frame, motion compensation is performed in all target modes, and basic information is collected and analyzed. After the initial frame, motion compensation is performed only in the middle 8×8 modes, and the amount of basic information shifts. To evaluate the effectiveness of the proposed method and assess the moving-image compression coding, four types of moving images defined by the International Telecommunication Union (ITU) are employed. The results show that the developed method simplifies the calculation, with only a slight penalty in image quality and amount of information.
Funding: This project was supported by the National Natural Science Foundation of China (No. 60272099).
Abstract: A new, faster block-matching algorithm (BMA) that uses both search-candidate subsampling and pixel subsampling is proposed. First, the pixel-subsampling approach used in adjustable partial distortion search (APDS) is adjusted to visit about half of all search candidates by subsampling them along a spiral-scanning path with one skip. The two candidates with the minimal and second-minimal block distortion measures are selected. Then a fine-tuning step is taken around them to find the best one. Some analyses are given to justify the approach. Experimental results show that, compared with APDS, the proposed algorithm improves block-matching speed by about 30% while keeping its MSE performance very close to that of APDS. It also performs much better than many other BMAs such as TSS, NTSS, UCDBS, and NPDS.
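The two-stage idea, a coarse pass over roughly half of the candidates with subsampled pixels followed by refinement around the two best, can be sketched as below. This is a simplified illustration and not the paper's algorithm: a checkerboard candidate pattern replaces the spiral scan with one skip, a plain subsampled SAD replaces the APDS partial distortion, and the function names `sad` and `search` are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, n, step=1):
    """Sum of absolute differences over every `step`-th pixel of an n x n block."""
    total = 0
    for j in range(0, n, step):
        for i in range(0, n, step):
            total += abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
    return total

def search(cur, ref, bx, by, n, r):
    """Return the motion vector (dx, dy) for the block at (bx, by), search range +/- r."""
    h, w = len(ref), len(ref[0])

    def valid(dx, dy):
        return (abs(dx) <= r and abs(dy) <= r
                and 0 <= bx + dx and bx + dx + n <= w
                and 0 <= by + dy and by + dy + n <= h)

    # Stage 1: visit about half of the candidates (checkerboard) with subsampled pixels.
    coarse = sorted(
        (sad(cur, ref, bx, by, dx, dy, n, step=2), dx, dy)
        for dy in range(-r, r + 1)
        for dx in range(-r, r + 1)
        if (dx + dy) % 2 == 0 and valid(dx, dy)
    )
    # Stage 2: refine around the two best coarse candidates using all pixels.
    best = None
    for _, cx, cy in coarse[:2]:
        for ddy in (-1, 0, 1):
            for ddx in (-1, 0, 1):
                dx, dy = cx + ddx, cy + ddy
                if valid(dx, dy):
                    cost = sad(cur, ref, bx, by, dx, dy, n)
                    if best is None or cost < best[0]:
                        best = (cost, dx, dy)
    return best[1], best[2]
```

The refinement step is what lets the search recover displacements that the coarse pass skipped, since every skipped candidate is adjacent to a visited one.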
Abstract: Entropy theory treats an image as a source with certain probabilistic characteristics. The entropy rate of the source and the ε-entropy (rate-distortion function theory) measure the information content that characterizes video images and are therefore essentially related to video image compression. They are fundamental theories of great significance to image compression, although they cannot be turned directly into a compression method. Based on entropy theory and image compression theory, and by applying a rate-distortion mathematical model and Lagrange multipliers to some theoretical problems in the H.264 standard, this paper presents a new algorithmic model of rate-distortion coding. The model is fully tested within the JM61e test model (JVT Test Model). The results show that coding speed increases without significant reduction in the rate-distortion performance of the coder.
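The role of the Lagrange multiplier in rate-distortion optimized coding can be illustrated with the standard cost J = D + λR, where the encoder picks the coding mode with the smallest J. The sketch below shows only this generic decision rule, not the paper's new model; the mode names, distortion values, and bit counts are hypothetical.

```python
def best_mode(candidates, lam):
    """Pick the mode minimizing the Lagrangian cost J = D + lam * R.

    candidates: mapping of mode name -> (distortion, rate_in_bits).
    """
    return min(candidates, key=lambda m: candidates[m][0] + lam * candidates[m][1])

# Hypothetical measurements for one macroblock.
modes = {"16x16": (120.0, 20), "8x8": (80.0, 60), "4x4": (60.0, 130)}

low_rate_choice = best_mode(modes, 2.0)    # large lam penalizes bits heavily
balanced_choice = best_mode(modes, 0.5)
high_quality_choice = best_mode(modes, 0.1)  # small lam favors low distortion
```

A larger λ pushes the decision toward cheaper modes with fewer bits, while a smaller λ favors finer partitions with lower distortion.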