Abstract: Space-time video super-resolution (STVSR) aims to reconstruct high-resolution, high-frame-rate videos from their low-resolution, low-frame-rate counterparts. Recent approaches use end-to-end deep learning models to achieve STVSR. They first interpolate intermediate frame features between the given frames, then perform local and global refinement over the feature sequence, and finally increase the spatial resolution of these features. However, in the crucial feature interpolation phase, they capture spatial-temporal information only from the most adjacent frame features, neglecting the long-term spatial-temporal correlations between multiple neighbouring frames that are needed to restore variable-speed object movements and maintain long-term motion continuity. In this paper, we propose a novel long-term temporal feature aggregation network (LTFA-Net) for STVSR. Specifically, we design a long-term mixture of experts (LTMoE) module for feature interpolation. LTMoE contains multiple experts that extract mutual and complementary spatial-temporal information from multiple consecutive adjacent frame features, which are then combined with different weights by several gating networks to obtain the interpolation results. Next, we perform local and global feature refinement using the Locally-temporal Feature Comparison (LFC) module and a bidirectional deformable ConvLSTM layer, respectively. Experimental results on two standard benchmarks, Adobe240 and GoPro, indicate the effectiveness and superiority of our approach over the state of the art.
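As a rough, illustrative sketch of the gating idea behind such a mixture-of-experts interpolation (this is not the authors' LTMoE implementation; the class name, channel size, number of experts, and number of input features below are assumptions), a PyTorch version might look like:

import torch
import torch.nn as nn

class MoEInterpolation(nn.Module):
    def __init__(self, channels=64, num_inputs=4, num_experts=4):
        super().__init__()
        in_ch = channels * num_inputs
        # Each expert proposes an intermediate-frame feature from the concatenated inputs.
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Conv2d(in_ch, channels, 3, padding=1),
                          nn.ReLU(inplace=True),
                          nn.Conv2d(channels, channels, 3, padding=1))
            for _ in range(num_experts)])
        # Gating network predicts one spatial weight map per expert.
        self.gate = nn.Conv2d(in_ch, num_experts, 3, padding=1)

    def forward(self, neighbour_feats):                 # list of (B, C, H, W) tensors
        x = torch.cat(neighbour_feats, dim=1)           # (B, C*num_inputs, H, W)
        proposals = torch.stack([e(x) for e in self.experts], dim=1)   # (B, E, C, H, W)
        weights = torch.softmax(self.gate(x), dim=1).unsqueeze(2)      # (B, E, 1, H, W)
        return (weights * proposals).sum(dim=1)         # weighted combination

feats = [torch.randn(1, 64, 32, 32) for _ in range(4)]  # four neighbouring frame features
interp = MoEInterpolation()(feats)                      # interpolated feature, (1, 64, 32, 32)

Each expert proposes an intermediate feature from the concatenated neighbouring features, and the gating network's softmax weights decide how much each proposal contributes at every spatial location.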
Funding: Supported by the Open Project of the Ministry of Industry and Information Technology Key Laboratory of Performance and Reliability Testing and Evaluation for Basic Software and Hardware.
Abstract: Background: Recurrent recovery is a common approach to video super-resolution (VSR) that models the correlation between frames via hidden states. However, applying this structure to real-world scenarios can lead to unsatisfactory artifacts. We found that in real-world VSR training, the use of unknown and complex degradations better simulates the degradation process found in the real world. Methods: Based on this, we propose the RealFuVSR model, which simulates real-world degradation and mitigates the artifacts produced by VSR. Specifically, we propose a multiscale feature extraction (MSF) module that extracts and fuses features from multiple scales, thereby facilitating the elimination of hidden-state artifacts. To improve the accuracy of the hidden-state alignment information, RealFuVSR uses advanced optical-flow-guided deformable convolution. Moreover, a cascaded residual upsampling module is used to eliminate the noise caused by the upsampling process. Results: Experiments demonstrate that the RealFuVSR model not only recovers high-quality videos but also outperforms the state-of-the-art RealBasicVSR and RealESRGAN models.
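To make the idea of training on unknown, complex degradations concrete, a minimal sketch of a randomized blur-downsample-noise-JPEG pipeline is shown below (an assumption-laden stand-in for illustration, not RealFuVSR's actual degradation model; all parameter ranges are illustrative):

import io, random
import numpy as np
from PIL import Image, ImageFilter

def degrade(hr_img, scale=4):
    # random Gaussian blur
    img = hr_img.filter(ImageFilter.GaussianBlur(radius=random.uniform(0.2, 3.0)))
    # bicubic downsampling
    img = img.resize((hr_img.width // scale, hr_img.height // scale), Image.BICUBIC)
    # additive Gaussian noise
    arr = np.asarray(img).astype(np.float32)
    arr += np.random.normal(0.0, random.uniform(1.0, 15.0), arr.shape)
    img = Image.fromarray(np.clip(arr, 0, 255).astype(np.uint8))
    # JPEG compression at a random quality
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=random.randint(30, 95))
    return Image.open(io.BytesIO(buf.getvalue()))

lr = degrade(Image.new("RGB", (256, 256), "gray"))      # toy usage: a 64x64 degraded frame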
Funding: The National Basic Research Program of China (973 Program) under Grant No. 2012CB821200, the National Natural Science Foundation of China under Grants No. 91024001 and No. 61070142, and the Beijing Natural Science Foundation under Grant No. 4111002.
Abstract: Video Super-Resolution (SR) reconstruction produces High-Resolution (HR) video sequences via the fusion of several Low-Resolution (LR) video frames. Traditional methods rely on accurate estimation of subpixel motion, which constrains their applicability to video sequences with relatively simple motions such as global translation. We propose an efficient iterative spatio-temporal adaptive SR reconstruction model based on the Zernike Moment (ZM), which is effective for spatial video sequences with arbitrary motion. The model uses region correlation judgment and self-adaptive threshold strategies to improve the effectiveness and time efficiency of the ZM-based SR method. This leads to better exploitation of non-local self-similarity and local structural regularity, and is robust to noise and rotation. An efficient iterative curvature-based interpolation scheme is introduced to obtain the initial HR estimate of each LR video frame. Experimental results on both spatial and standard video sequences demonstrate that the proposed method outperforms existing methods in terms of both subjective visual and objective quantitative evaluations, and greatly improves time efficiency.
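For reference, a Zernike moment Z_{n,m} of an image patch mapped onto the unit disc can be sketched as follows (a plain NumPy illustration, not the paper's optimized implementation; the sampling and normalization choices are assumptions):

import numpy as np
from math import factorial

def radial_poly(n, m, rho):
    # Zernike radial polynomial R_{n,|m|}(rho); requires n - |m| even and non-negative.
    m = abs(m)
    R = np.zeros_like(rho)
    for s in range((n - m) // 2 + 1):
        c = ((-1) ** s * factorial(n - s) /
             (factorial(s) * factorial((n + m) // 2 - s) * factorial((n - m) // 2 - s)))
        R += c * rho ** (n - 2 * s)
    return R

def zernike_moment(patch, n, m):
    h, w = patch.shape
    y, x = np.mgrid[-1:1:h * 1j, -1:1:w * 1j]           # map the patch onto [-1, 1]^2
    rho, theta = np.hypot(x, y), np.arctan2(y, x)
    mask = rho <= 1.0                                   # keep samples inside the unit disc
    basis = radial_poly(n, m, rho) * np.exp(-1j * m * theta)   # conjugate basis V*_{n,m}
    area = np.pi / mask.sum()                           # approximate area element per sample
    return (n + 1) / np.pi * np.sum(patch[mask] * basis[mask]) * area

Z31 = zernike_moment(np.random.rand(16, 16), n=3, m=1)  # toy usage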
Funding: The Natural Science Foundation of Jiangsu Province (No. BK2004151).
Abstract: The Super-Resolution (SR) technique reconstructs High-Resolution (HR) images from a sequence of Low-Resolution (LR) observations and has been a major focus for compressed video. Based on the theory of Projection Onto Convex Sets (POCS), this paper constructs a Quantization Constraint Set (QCS) using the quantization information extracted from the video bitstream. By combining the statistical properties of the image with the Human Visual System (HVS), a novel Adaptive Quantization Constraint Set (AQCS) is proposed. Simulation results show that the AQCS-based SR algorithm converges quickly and achieves better performance in both objective and subjective quality, making it applicable to compressed video.
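The core POCS step, projecting an estimate's transform coefficients back into the quantization bins implied by the bitstream, can be sketched as follows (a simplified illustration assuming a uniform quantizer; the adaptive, HVS-weighted bounds of AQCS are not modelled here):

import numpy as np
from scipy.fft import dctn, idctn

def project_onto_qcs(block, q_levels, q_step):
    # Clip the block's DCT coefficients into the bins implied by the decoded levels.
    coeffs = dctn(block, norm="ortho")
    lower = (q_levels - 0.5) * q_step                   # each level q constrains the true
    upper = (q_levels + 0.5) * q_step                   # coefficient to [(q-0.5)Q, (q+0.5)Q]
    return idctn(np.clip(coeffs, lower, upper), norm="ortho")

block = np.random.rand(8, 8)
q_step = 16.0
q_levels = np.round(dctn(block, norm="ortho") / q_step) # levels a decoder would recover
estimate = block + 0.05 * np.random.randn(8, 8)         # a perturbed HR-derived estimate
projected = project_onto_qcs(estimate, q_levels, q_step)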
Funding: Supported by the Natural Science Foundation of Jiangsu Province (No. BK2004151).
Abstract: This letter proposes a novel method for compressed-video super-resolution reconstruction based on MAP-POCS (Maximum A Posteriori Probability-Projection Onto Convex Sets). The high-resolution image model is first assumed to follow a Poisson-Markov distribution, and the projection convex sets are then constructed based on MAP estimation. According to the characteristics of compressed video, two different convex sets are constructed by integrating inter-frame and intra-frame information in the wavelet domain. Experimental results demonstrate that the new method not only outperforms traditional algorithms in terms of PSNR (Peak Signal-to-Noise Ratio), MSE (Mean Square Error), and visual reconstruction quality, but also offers rapid convergence and easy extensibility.
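A minimal sketch of the MAP-POCS alternation, one gradient step on a MAP objective followed by projection onto a convex constraint set, is given below; the quadratic data term, Laplacian smoothness prior, and amplitude bound are stand-ins we assume for illustration, not the letter's Poisson-Markov model or its wavelet-domain convex sets:

import numpy as np

def observe(x, scale=2):
    # Toy observation model: average-pool downsampling of the HR estimate.
    h, w = x.shape
    return x.reshape(h // scale, scale, w // scale, scale).mean(axis=(1, 3))

def observe_adjoint(r, scale=2):
    # Adjoint of the toy observation model (spreads each residual over its block).
    return np.kron(r, np.ones((scale, scale))) / (scale * scale)

def laplacian(x):
    return (-4 * x + np.roll(x, 1, 0) + np.roll(x, -1, 0)
            + np.roll(x, 1, 1) + np.roll(x, -1, 1))

def map_pocs(y, iters=50, step=1.0, lam=0.05):
    x = np.kron(y, np.ones((2, 2)))                     # initial HR estimate by replication
    for _ in range(iters):
        residual = y - observe(x)
        # MAP gradient step: data fidelity plus a smoothness prior.
        x = x + step * (observe_adjoint(residual) + lam * laplacian(x))
        # POCS projection onto a convex set (here: an amplitude-bound constraint).
        x = np.clip(x, 0.0, 1.0)
    return x

hr = map_pocs(np.random.rand(16, 16))                   # toy usage: 32x32 reconstruction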
Abstract: Recent technological developments have made surveillance video a primary means of preserving public security. Many city crimes are observed in surveillance video, and the most abundant evidence collected by the police is acquired from surveillance video sources. Because surveillance footage offers very strong support for solving criminal cases, creating effective policies and applying useful methods to retrieve additional evidence is becoming increasingly important. However, surveillance video has its shortcomings, namely footage captured at low resolution (LR) and with poor visual quality. In this paper, we discuss the characteristics of surveillance video and develop a super-resolution reconstruction method based on manual feature registration, maximum a posteriori estimation, and projection onto convex sets, which improves the quality of surveillance video. This method makes optimal use of the information contained in the LR video images while keeping image edges sharp and ensuring convergence of the algorithm. Finally, we suggest how to adjust the algorithm's adaptability by analyzing the prior information of the target image.