High-resolution video transmission requires a substantial amount of bandwidth. In this paper, we present a video processing methodology that integrates region of interest (ROI) identification with super-resolution enhancement. The method first detects ROIs within video sequences and then applies super-resolution techniques to these areas, preserving visual quality while economizing on data transmission. To validate and benchmark the approach, we have curated a new gaming dataset tailored to evaluating ROI-based super-resolution in practical applications. The proposed model architecture builds on the transformer network framework and is guided by a multi-task loss function, which allows ROI identification and resolution enhancement to be learned and executed concurrently. This unified deep learning model achieves strong super-resolution performance on our custom dataset. The research applies to low-bitrate video streaming: by selectively enhancing the resolution of critical regions, our solution enables high-quality video delivery under constrained bandwidth. Empirical results demonstrate a 15% reduction in transmission bandwidth compared with traditional super-resolution-based compression methods, without any perceivable decline in visual quality. This work thus contributes to video compression and enhancement technologies, offering an effective strategy for improving digital media delivery efficiency and user experience, especially in bandwidth-limited environments. The integration of ROI identification and super-resolution also opens avenues for future research on adaptive and intelligent video communication systems.
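The abstract above mentions a multi-task loss but does not give its terms. As a minimal sketch, assuming an L1 reconstruction term for super-resolution and a binary cross-entropy term for the ROI mask, combined with hypothetical weights `w_sr` and `w_roi`, the idea could look like this (the function names are illustrative and not taken from the paper):

```python
import math

def l1_loss(pred, target):
    """Mean absolute error between two equally sized pixel lists."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)

def bce_loss(prob, mask, eps=1e-7):
    """Binary cross-entropy between ROI probabilities and a 0/1 mask."""
    total = 0.0
    for p, m in zip(prob, mask):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total += -(m * math.log(p) + (1 - m) * math.log(1.0 - p))
    return total / len(prob)

def multi_task_loss(sr_pred, hr_target, roi_prob, roi_mask, w_sr=1.0, w_roi=0.5):
    """Weighted sum of the two task losses (weights are assumptions)."""
    return w_sr * l1_loss(sr_pred, hr_target) + w_roi * bce_loss(roi_prob, roi_mask)
```

A single scalar of this kind lets one network be trained on both tasks at once; the actual loss terms and weights used by the authors may differ.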
A layered compression algorithm is presented that delivers spatially scalable encoded bit streams for remote video monitoring systems. The complexity of the algorithm is modest, which makes it well suited to real-time implementation. Based on the layered compression algorithm, a codec system model is established. High-speed video compression can be realized in this codec system through parallel data compression. For image reconstruction, a prediction method using the two nearest pixel points is presented.
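The abstract does not say how the two nearest pixel points are combined. One simple reading, offered only as an assumption, is that each missing pixel in the enhancement layer is predicted as the average of its two nearest known neighbours. The hypothetical helper `upsample_row` sketches this for one image row:

```python
def upsample_row(row):
    """Double the horizontal resolution of a row.

    Known pixels are kept; each new pixel between two known pixels is
    predicted as their average (an assumed predictor, not the paper's).
    """
    out = []
    for i, value in enumerate(row):
        out.append(value)
        if i + 1 < len(row):
            out.append((value + row[i + 1]) / 2)
    return out

print(upsample_row([10, 20, 40]))  # [10, 15.0, 20, 30.0, 40]
```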
The high-efficiency video coder (HEVC) is one of the most advanced techniques used in today's growing real-time multimedia applications. However, HEVC video requires large bandwidth for transmission, and the bandwidth required varies with the video sequence/format. This paper proposes an adaptive information-based variable quantization matrix (AIVQM) developed for different video formats having variable energy levels. The quantization method is adapted to the video sequence using statistical analysis, improving bit budget and quality while reducing complexity. Further, to give precise control over bit rate and quality, a multi-constraint prune algorithm is proposed in the second stage of the AIVQM technique for pre-calculating K paths. This allows the coder to self-adapt and choose one of the K paths automatically as bandwidth availability changes dynamically. After extensive testing of the proposed algorithm in a multi-constraint environment with multiple paths, and evaluation of its performance in terms of peak signal-to-noise ratio (PSNR), bit budget, and time complexity for different videos, a noticeable improvement in rate-distortion (RD) performance is achieved. Using the proposed AIVQM technique, more feasible and efficient video sequences are achieved with less loss in PSNR than with the variable quantization method (VQM) algorithm, with an improvement of approximately 10%–20% depending on the video sequence/format.
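Several abstracts in this listing report results as PSNR. For reference, a minimal sketch of the standard definition, PSNR = 10·log10(peak² / MSE), is given below; the 8-bit peak value of 255 is an assumption about the pixel format, and the function names are illustrative.

```python
import math

def mse(reference, degraded):
    """Mean squared error between two equally sized pixel lists."""
    return sum((r - d) ** 2 for r, d in zip(reference, degraded)) / len(reference)

def psnr(reference, degraded, peak=255):
    """Peak signal-to-noise ratio in dB; infinite for identical inputs."""
    error = mse(reference, degraded)
    if error == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / error)
```

Higher PSNR means the decoded frame is closer to the original, so "less loss in PSNR" at the same bit budget indicates better rate-distortion behaviour.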
To improve the performance of video compression for machine vision analysis tasks, a video coding for machines (VCM) standard working group was established to promote standardization. In this paper, recent advances in VCM standards are presented, and the use cases, requirements, evaluation frameworks, and corresponding metrics of the VCM standard are introduced comprehensively. The existing methods are then presented, covering the existing proposals by category and the research progress reported at the latest VCM conference. Finally, we give conclusions.
In this paper, we present a method that uses video codec technology to compress ECG signals. The method exploits both intra-beat and inter-beat correlations of the ECG signals to achieve high compression ratios (CR) and a low percent root mean square difference (PRD). Because ECG signals have both intra-beat and inter-beat redundancies, much as video signals have both intra-frame and inter-frame correlation, video codec technology can be used for ECG compression. Some pre-processing is needed to do this. The ECG signals are first segmented and normalized into a sequence of beat cycles of the same length; these beat cycles can then be treated as picture frames and compressed with video codec technology. We have used records from the MIT-BIH arrhythmia database to evaluate our algorithm. Results show that, besides compressing efficiently, the algorithm offers adjustable resolution, random access, and flexibility in handling irregular periods and QRS false detections.
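The pre-processing step described above can be sketched as follows, assuming that beat boundaries (for example, detected R-peak positions) are already available and that beats are length-normalized by linear interpolation. The helper names, the choice of interpolation, and the boundary convention are assumptions; the PRD formula is the standard one, 100·sqrt(Σ(x − x̂)² / Σx²).

```python
import math

def resample(beat, length):
    """Linearly interpolate a beat to a fixed number of samples (length >= 2)."""
    out = []
    for k in range(length):
        pos = k * (len(beat) - 1) / (length - 1)
        i = int(pos)
        frac = pos - i
        nxt = beat[min(i + 1, len(beat) - 1)]
        out.append(beat[i] * (1 - frac) + nxt * frac)
    return out

def beats_to_frames(signal, boundaries, length):
    """Cut the signal at the given boundaries and normalize every beat
    to the same length, so the beats can be stacked like picture frames."""
    frames = []
    for start, end in zip(boundaries, boundaries[1:]):
        frames.append(resample(signal[start:end], length))
    return frames

def prd(original, reconstructed):
    """Percent root mean square difference between two signals."""
    num = sum((x - y) ** 2 for x, y in zip(original, reconstructed))
    den = sum(x ** 2 for x in original)
    return 100 * math.sqrt(num / den)
```

A decoder would also need the original beat lengths to undo the normalization; how the paper stores them is not stated in the abstract.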
In this paper, we summarize 3D perception-oriented algorithms for perceptually driven 3D video coding. Several perceptual effects have been exploited for 2D video viewing; however, this is not yet the case for 3D video viewing. 3D video requires depth perception, which implies binocular effects such as conflicts, fusion, and rivalry. A better understanding of these effects is necessary for 3D perceptual compression, which provides users with a more comfortable visual experience for video that is delivered over a channel with limited bandwidth. We present the state of the art in 3D visual attention models, 3D just-noticeable difference models, and 3D texture-synthesis models that address 3D human vision issues in 3D video coding and transmission.
BACKGROUND: It remains unclear whether video aids can improve the quality of bystander cardiopulmonary resuscitation (CPR). AIM: To summarize simulation-based studies aiming at improving bystander CPR associated with the quality of chest compression and time-related quality parameters. METHODS: The systematic review was conducted according to the PRISMA guidelines. All relevant studies were searched through the PubMed, EMBASE, Medline and Cochrane Library databases. The risk of bias was evaluated using the Cochrane collaboration tool. RESULTS: A total of 259 studies were eligible for inclusion, and 6 randomised controlled trial studies were ultimately included. The results of the meta-analysis indicated that video-assisted CPR (V-CPR) was significantly associated with an improved mean chest compression rate [OR=0.66 (0.49-0.82), P<0.001] and with the proportion of chest compressions with correct hand positioning [OR=1.63 (0.71-2.55), P<0.001]. However, the difference in mean chest compression depth was not statistically significant [OR=0.18 (-0.07-0.42), P=0.15], and V-CPR was not associated with the time to first chest compression compared with telecommunicator CPR [OR=-0.12 (-0.88-0.63), P=0.75]. CONCLUSION: Real-time video guidance by the dispatcher can improve the quality of bystander CPR to a certain extent. However, the quality is still not ideal, and guidance can be lacking because of poor video signal or inadequate interaction.
In this paper, a new mesh-based algorithm is applied for motion estimation and compensation in the wavelet domain. The first major contribution of this work is the introduction of a new active-mesh-based method for motion estimation and compensation. The proposed algorithm is based on mesh energy minimization with novel sets of energy functions, whose features improve the accuracy of the motion estimation and compensation algorithm. We employ the proposed motion estimation algorithm in two different ways for video compression. In the first approach, the algorithm is employed for motion estimation between consecutive frames. In the second approach, the algorithm is applied for motion estimation and compensation in the wavelet sub-bands. The experimental results reveal that incorporating active-mesh-based motion-compensated temporal filtering into the wavelet sub-bands significantly improves the rate-distortion performance of the video compression. We also use a new wavelet coder, based on a retained-energy criterion, for coding the 3D volume of coefficients. This coder gives the maximum retained energy in all sub-bands. The proposed algorithm was tested with several video sequences, and the results showed that using the proposed active mesh method for motion compensation and implementing it in the sub-bands yields a significant improvement in PSNR performance.
The particle image velocimetry (PIV) method was used to investigate the full-field displacements and strains of a limestone specimen under external loads from video images captured during laboratory tests. The original color video images and experimental data were obtained from a uniaxial compression test of a limestone. To eliminate perspective errors and lens distortion, the camera was placed normal to the exposed face of the rock specimen. The videos were converted into a readable format, the frame images were extracted, and the frames were transformed into the corresponding grayscale images. The full-field displacement field was obtained using the PIV technique and interpolated at sub-pixel locations. The displacement was measured in the plane of the image and inferred from two consecutive images. The local displacement vectors were calculated for small sub-windows of the images by means of cross-correlation. The video images were interrogated in a multi-pass way, starting with 64×64 interrogation windows, ending with 16×16 windows after 6 iterations, and using 75% overlap of the sub-windows. To remove spurious vectors, the displacements were filtered using four filters: a signal-to-noise ratio filter, a peak height filter, a global filter, and a local filter. Cubic interpolation was used wherever displacement values were missing (not a number). The full-field strain was then obtained from the discrete displacements using the local least squares method. The change of strain with time at different locations was also investigated. It is found that the normal strains depend on the locations and the crack distributions. Between 1.0 and 5.0 s prior to specimen failure, normal strains increase rapidly at many locations, while a stable status appears at some locations. When the specimen is in a failure status, a large rotation occurs and it increases in the inverse direction. The strain concentration bands do not completely develop into large cracks, and meso-cracks are not visible in some bands. The techniques presented here may improve on the traditional measurement of the strain field and may provide valuable information for investigating the deformation/failure mechanism of rock materials.
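The core PIV step, estimating a local displacement from two consecutive sub-windows by cross-correlation, can be illustrated with a brute-force integer search. This is a sketch under simplifying assumptions: the hypothetical function `best_shift` uses an unnormalized correlation, a small search range, and no sub-pixel interpolation or multi-pass refinement, all of which the actual study applies on top.

```python
def best_shift(win_a, win_b, max_shift=2):
    """Return the integer (dx, dy) that maximizes the cross-correlation
    between sub-window win_a (frame t) and win_b (frame t+1)."""
    h, w = len(win_a), len(win_a[0])
    best = None
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            score = 0
            for y in range(h):
                for x in range(w):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        score += win_a[y][x] * win_b[yy][xx]
            if best is None or score > best[0]:
                best = (score, dx, dy)
    return best[1], best[2]

# A bright speckle moves 2 pixels right and 1 pixel down between frames.
a = [[0] * 6 for _ in range(6)]
b = [[0] * 6 for _ in range(6)]
a[2][2] = 1
b[3][4] = 1
print(best_shift(a, b))  # (2, 1)
```

Production PIV codes usually compute the correlation with FFTs and fit the correlation peak to obtain sub-pixel displacements.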
An edge-oriented image sequence coding scheme is presented. On the basis of edge detection, an image can be divided into a sensitized region and a smooth region. In this scheme, the architecture of the sensitized region is approximated with linear segments. A rectangular belt is then constructed for each segment. Finally, the gray value distribution in the region is fitted by normal form polynomials. Model matching and motion analysis are also based on the architecture of the sensitized region. For the smooth region, we use run-length scanning and linear approximation. The images are compressed by means of normal form polynomial fitting and motion prediction by matching. Simulations show that the subjective quality of the reconstructed picture is excellent at 0.0075 bit per pel.
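Run-length scanning of a smooth region can be sketched as plain run-length coding of a scan line, shown below. The abstract combines it with linear approximation, which is not reproduced here; the function names are illustrative.

```python
def run_length_encode(row):
    """Collapse consecutive equal gray values into (value, count) pairs."""
    runs = []
    for value in row:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return [(value, count) for value, count in runs]

def run_length_decode(runs):
    """Expand (value, count) pairs back into the original scan line."""
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out

print(run_length_encode([5, 5, 5, 7, 7, 5]))  # [(5, 3), (7, 2), (5, 1)]
```

Smooth regions produce long runs, which is why this representation is compact there and much less useful in edge-dense regions.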
Although compressive measurements save data storage and bandwidth, they are difficult to use directly for target tracking and classification without pixel reconstruction. This is because the Gaussian random matrix destroys the target location information in the original video frames. This paper summarizes our research effort on target tracking and classification directly in the compressive measurement domain. We focus on one particular type of compressive measurement that uses pixel subsampling; that is, the original pixels in the video frames are randomly subsampled. Even in such a special compressive sensing setting, conventional trackers do not work satisfactorily. We propose a deep learning approach that integrates YOLO (You Only Look Once) and ResNet (residual network) for multiple target tracking and classification. YOLO is used for multiple target tracking and ResNet for target classification. Extensive experiments using short-wave infrared (SWIR), mid-wave infrared (MWIR), and long-wave infrared (LWIR) videos demonstrated the efficacy of the proposed approach even though the training data are very scarce.
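The pixel-subsampling measurement described above can be sketched as a random binary mask applied to each frame. The function name, the `keep_ratio` and `seed` parameters, and the choice to zero out missing pixels are assumptions made for illustration.

```python
import random

def subsample_frame(frame, keep_ratio, seed=0):
    """Keep a random fraction of pixels; return (mask, measured_frame).

    Missing pixels are set to 0 here, which is an illustrative choice.
    """
    h, w = len(frame), len(frame[0])
    n_keep = round(keep_ratio * h * w)
    rng = random.Random(seed)
    kept = set(rng.sample(range(h * w), n_keep))
    mask = [[1 if (y * w + x) in kept else 0 for x in range(w)] for y in range(h)]
    measured = [[frame[y][x] if mask[y][x] else 0 for x in range(w)] for y in range(h)]
    return mask, measured
```

Unlike a Gaussian random measurement matrix, this kind of mask leaves every retained pixel at its original location, which is what makes tracking directly on the measurements plausible.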
The Super-Resolution (SR) technique reconstructs High-Resolution (HR) images from a sequence of Low-Resolution (LR) observations and has been a major focus for compressed video. Based on the theory of Projection Onto Convex Sets (POCS), this paper constructs a Quantization Constraint Set (QCS) using the quantization information extracted from the video bit stream. By combining the statistical properties of the image with the Human Visual System (HVS), a novel Adaptive Quantization Constraint Set (AQCS) is proposed. Simulation results show that the AQCS-based SR algorithm converges quickly and obtains better performance in both objective and subjective quality, which makes it applicable to compressed video.
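A quantization constraint set is commonly defined by requiring each transform coefficient of the estimate to stay inside the quantization interval signalled in the bit stream. The sketch below assumes a uniform quantizer without a dead zone, so a coefficient decoded at level `l` with step `step` must lie in [(l − 0.5)·step, (l + 0.5)·step]; projecting onto the set is then a clip. The adaptive interval widths of AQCS are not reproduced, and the function name is hypothetical.

```python
def project_onto_qcs(coeffs, levels, step):
    """Clip each coefficient to its quantization interval (POCS projection)."""
    projected = []
    for c, level in zip(coeffs, levels):
        low = (level - 0.5) * step
        high = (level + 0.5) * step
        projected.append(min(max(c, low), high))
    return projected

# Coefficients already inside their intervals are left unchanged.
print(project_onto_qcs([10.0, 3.0, -9.0], [1, 0, -1], 8))  # [10.0, 3.0, -9.0]
# Coefficients outside are moved to the nearest interval boundary.
print(project_onto_qcs([20.0, 5.0, 0.0], [1, 0, -1], 8))   # [12.0, 4.0, -4.0]
```

In a POCS loop this projection alternates with the other constraints until the estimate is consistent with all of them.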
A number of automated video shot boundary detection methods for indexing a video sequence to facilitate browsing and retrieval have been proposed in recent years. With these methods, the dissolve shot boundary is not detected accurately because it involves camera operation and object movement. In this paper, a method based on the support vector machine (SVM) is proposed to detect the dissolve shot boundary in MPEG compressed sequences. In our method, distinguishing the dissolve shot boundary from other boundaries is treated as a two-class classification problem. Features are extracted directly from the compressed sequences without decoding them, and the optimal class boundary between the two classes is learned from training data using the SVM. Experiments comparing various classification methods show that the proposed method yields encouraging video shot boundary detection performance.
This paper proposes a thorough scheme, based on a camera zooming descriptor with a two-level threshold, to automatically retrieve close-ups directly from Moving Picture Experts Group (MPEG) compressed videos using camera motion analysis. A new algorithm for fast camera motion estimation in the compressed domain is presented. In the retrieval process, camera-motion-based semantic retrieval is built. To improve the coverage of the proposed scheme, close-up retrieval in all kinds of videos is investigated. Extensive experiments illustrate that the proposed scheme provides promising retrieval results in real-time, automatic application scenarios.
This letter proposes a novel method of compressed video super-resolution reconstruction based on MAP-POCS (Maximum Posterior Probability-Projection Onto Convex Set). First, the high-resolution model is assumed to follow a Poisson-Markov distribution, and the projection convex set is then constructed based on MAP. According to the characteristics of compressed video, two different convex sets are constructed by integrating inter-frame and intra-frame information in the wavelet domain. The experimental results demonstrate that the new method not only outperforms traditional algorithms in terms of PSNR (Peak Signal-to-Noise Ratio), MSE (Mean Square Error), and the visual quality of the reconstruction, but also has the advantages of rapid convergence and easy extension.
Extraction of traffic information from images or video sequences is a hot research topic in intelligent transportation systems and computer vision. A real-time traffic information extraction method based on compressed video, using inter-frame motion vectors for speed, density, and flow detection, is proposed for extracting traffic information under a fixed camera setting and a well-defined environment. The motion vectors are first separated from the compressed video streams and then filtered to eliminate incorrect and noisy vectors using knowledge of the well-defined environment. By applying the projective transform and using the filtered motion vectors, speed can be calculated from motion vector statistics, density can be estimated from the motion vector occupancy, and flow can be detected using the combination of speed and density. A prototype system for sky camera traffic monitoring using MPEG video has been implemented, and experimental results proved the effectiveness of the proposed method.
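The three quantities named in the abstract can be illustrated with a rough sketch. The assumptions are substantial: the motion vectors are taken to be already filtered, a constant `metres_per_pixel` scale stands in for the projective transform, density is taken as the fraction of macroblocks with a nonzero vector, and flow as the product of speed and density. None of these specifics come from the paper, and `traffic_stats` is a hypothetical name.

```python
import math

def traffic_stats(vectors, metres_per_pixel, fps):
    """vectors: one (dx, dy) per macroblock, in pixels per frame.

    Returns (speed in m/s, density as occupancy fraction, flow).
    """
    moving = [(dx, dy) for dx, dy in vectors if (dx, dy) != (0, 0)]
    density = len(moving) / len(vectors)
    if not moving:
        return 0.0, density, 0.0
    mean_magnitude = sum(math.hypot(dx, dy) for dx, dy in moving) / len(moving)
    speed = mean_magnitude * metres_per_pixel * fps
    flow = speed * density
    return speed, density, flow
```

Because the vectors are read from the compressed stream, no full decoding or optical flow computation is needed, which is what makes real-time operation feasible.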
Pornographic image/video recognition plays a vital role in network information surveillance and management. In this paper, its key techniques, such as skin detection, key frame extraction, and classifier design, are studied in the compressed domain. First, a skin detection method based on data mining in the compressed domain is proposed, which achieves higher detection accuracy as well as higher speed. Then, a cascade scheme for pornographic image recognition based on a selective decision tree ensemble is proposed to improve both the speed and the accuracy of recognition. Finally, a key frame extraction solution in the compressed domain oriented to pornographic video and an approach to pornographic video recognition are discussed.
The two mast cameras (Mastcams) onboard the Mars rover Curiosity are multispectral imagers with nine bands each. Currently, the images are compressed losslessly using JPEG, which achieves only two to three times compression. We present a comparative study of four approaches to compressing multispectral Mastcam images. The first approach divides the nine bands into three groups of three bands each. Since the multispectral bands are strongly correlated, we treat the three groups of images as video frames; we call this the Video approach. The second approach compresses each group separately, and we call it the split band (SB) approach. The third is a two-step approach in which the first step uses principal component analysis (PCA) to compress a nine-band image cube to six bands and the second step compresses the six PCA bands using conventional codecs. The fourth applies PCA only. In addition, we present subjective and objective assessment results for compressing RGB images, because RGB images have been used for stereo and disparity map generation. Five well-known compression codecs from the literature, namely JPEG, JPEG-2000 (J2K), X264, X265, and Daala, have been applied and compared in each approach. The performance of the different algorithms was assessed using four well-known performance metrics: two are conventional, and the other two are known to correlate well with human perception. Extensive experiments using actual Mastcam images have been performed to demonstrate the various approaches. We observed that perceptually lossless compression can be achieved at a 10:1 compression ratio. In particular, the performance gain of the SB approach with Daala over JPEG is at least 5 dB in terms of peak signal-to-noise ratio (PSNR) at a 10:1 compression ratio. Subjective comparisons also corroborated the objective metrics in that perceptually lossless compression can be achieved even at 20:1 compression.
Funding for the ROI-based video super-resolution study: National Key Research and Development Program of China (No. 2022YFC3302103).
Funding for the video coding for machines (VCM) survey: ZTE Industry-University-Institute Cooperation Funds.
Funding for the video-assisted CPR systematic review: Fundamental Research Funds for the Central Universities, Northwest Minzu University, Grant No. 31920170180.
Funding: Project (40972191) supported by the National Natural Science Foundation of China; Project (09YZ39) supported by the Creative Issue of Shanghai Education Committee, China.
Abstract: The particle image velocimetry (PIV) method was used to investigate the full-field displacements and strains of a limestone specimen under external loads, using video images captured during laboratory tests. The original color video images and experimental data were obtained from a uniaxial compression test of a limestone. To eliminate perspective errors and lens distortion, the camera was placed normal to the exposed face of the rock specimen. After conversion into a readable format, the videos were transformed into the corresponding grayscale images, and the frame images were then extracted. The full-field displacement field was obtained using the PIV technique and interpolated at sub-pixel locations. The displacement was measured in the plane of the image and inferred from two consecutive images. The local displacement vectors were calculated for small sub-windows of the images by means of cross-correlation. The video images were interrogated in a multi-pass way, starting with 64×64 sub-windows and ending with 16×16 sub-windows after 6 iterations, with 75% overlap of the sub-windows. To remove spurious vectors, the displacements were filtered using four filters: a signal-to-noise ratio filter, a peak height filter, a global filter and a local filter. Cubic interpolation was used wherever non-numeric (missing) displacements were encountered. The full-field strain was then obtained from the discrete displacements using the local least squares method. The change of strain with time at different locations was also investigated. It is found that the normal strains depend on the locations and the crack distributions. Between 1.0 and 5.0 s prior to specimen failure, normal strains increase rapidly at many locations, while a stable state appears at some locations. When the specimen fails, a large rotation occurs and it increases in the inverse direction. The strain concentration bands do not completely develop into large cracks, and meso-cracks are not visible in some bands. The techniques presented here may improve the traditional measurement of the strain field and may provide valuable information for investigating the deformation and failure mechanisms of rock materials.
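The cross-correlation step described in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes NumPy, uses a circular (FFT-based) correlation, and recovers only an integer-pixel shift for a single sub-window. The function name `window_displacement` is hypothetical, and the multi-pass refinement, sub-pixel interpolation and vector filters mentioned in the abstract are not covered.

```python
import numpy as np

def window_displacement(win_a, win_b):
    """Estimate the integer-pixel shift (dy, dx) that maps win_a onto win_b."""
    # Remove the mean so the correlation peak reflects texture, not brightness
    a = win_a - win_a.mean()
    b = win_b - win_b.mean()
    # Circular cross-correlation via the FFT: corr[s] = sum_x a[x] * b[x + s]
    corr = np.fft.ifft2(np.conj(np.fft.fft2(a)) * np.fft.fft2(b)).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Peak indices beyond half the window size correspond to negative shifts
    return tuple(int(p - n) if p > n // 2 else int(p)
                 for p, n in zip(peak, corr.shape))

# Usage: shift a random 32x32 window by 3 rows down and 2 columns left
rng = np.random.default_rng(0)
frame = rng.random((32, 32))
moved = np.roll(frame, shift=(3, -2), axis=(0, 1))
shift = window_displacement(frame, moved)
```

Because the correlation is circular, the sketch is only meaningful for shifts smaller than half the window size, which is one reason a multi-pass scheme starts from larger windows.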
Abstract: An edge-oriented image sequence coding scheme is presented. On the basis of edge detection, an image is divided into a sensitized region and a smooth region. In this scheme, the architecture of the sensitized region is approximated with linear segments. A rectangular belt is then constructed for each segment. Finally, the gray value distribution in the region is fitted by normal-form polynomials. The model matching and motion analysis are also based on the architecture of the sensitized region. For the smooth region, we use run-length scanning and linear approximation. The images are compressed by means of normal-form polynomial fitting and motion prediction by matching. Simulations show that the subjective quality of the reconstructed picture is excellent at 0.0075 bit per pel.
Abstract: Although compressive measurements save data storage and bandwidth, they are difficult to use directly for target tracking and classification without pixel reconstruction. This is because the Gaussian random matrix destroys the target location information in the original video frames. This paper summarizes our research effort on target tracking and classification directly in the compressive measurement domain. We focus on one particular type of compressive measurement, pixel subsampling, in which the original pixels in video frames are randomly subsampled. Even in this special compressive sensing setting, conventional trackers do not work satisfactorily. We propose a deep learning approach that integrates YOLO (You Only Look Once) and ResNet (residual network) for multiple target tracking and classification. YOLO is used for multiple target tracking and ResNet for target classification. Extensive experiments using short-wave infrared (SWIR), mid-wave infrared (MWIR), and long-wave infrared (LWIR) videos demonstrated the efficacy of the proposed approach even though the training data are very scarce.
Funding: Supported by the National Natural Science Foundation of China (61170147), the Major Cooperation Project of Production and College in Fujian Province (2012H61010016), and the Natural Science Foundation of Fujian Province (2013J01234).
Funding: Supported by the Natural Science Foundation of Jiangsu Province (No. BK2004151).
Abstract: Super-resolution (SR) reconstructs high-resolution (HR) images from a sequence of low-resolution (LR) observations and has been a major focus for compressed video. Based on the theory of Projection Onto Convex Sets (POCS), this paper constructs a Quantization Constraint Set (QCS) using the quantization information extracted from the video bit stream. By combining the statistical properties of the image with the Human Visual System (HVS), a novel Adaptive Quantization Constraint Set (AQCS) is proposed. Simulation results show that the AQCS-based SR algorithm converges quickly and achieves better performance in both objective and subjective quality, which makes it applicable to compressed video.
Abstract: A number of automated video shot boundary detection methods for indexing a video sequence to facilitate browsing and retrieval have been proposed in recent years. With these methods, the dissolve shot boundary is not accurately detected because it involves camera operation and object movement. In this paper, a method based on the support vector machine (SVM) is proposed to detect the dissolve shot boundary in MPEG compressed sequences. Distinguishing the dissolve shot boundary from other boundaries is treated as a two-class classification problem. Features are extracted directly from the compressed sequences without decoding them, and the optimal class boundary between the two classes is learned from training data using the SVM. Experiments comparing various classification methods show that the proposed method yields encouraging video shot boundary detection performance.
Funding: This work was supported by the European IST FP6 Research Programme as funded for the Integrated Project: LIVE (No. IST-4-027312).
Abstract: This paper proposes a complete scheme, based on a camera zooming descriptor with a two-level threshold, to automatically retrieve close-ups directly from Moving Picture Experts Group (MPEG) compressed videos using camera motion analysis. A new algorithm for fast camera motion estimation in the compressed domain is presented. In the retrieval process, camera-motion-based semantic retrieval is built. To improve the coverage of the proposed scheme, close-up retrieval in all kinds of videos is investigated. Extensive experiments illustrate that the proposed scheme provides promising retrieval results in real-time, automatic application scenarios.
Funding: Supported by the Natural Science Foundation of Jiangsu Province (No. BK2004151).
Abstract: This letter proposes a novel method of compressed video super-resolution reconstruction based on MAP-POCS (Maximum A Posteriori-Projection Onto Convex Sets). First, the high-resolution model is assumed to follow a Poisson-Markov distribution; the projection convex sets are then constructed based on MAP. According to the characteristics of compressed video, two different convex sets are constructed by integrating the inter-frame and intra-frame information in the wavelet domain. The experimental results demonstrate that the new method not only outperforms the traditional algorithms in terms of PSNR (Peak Signal-to-Noise Ratio), MSE (Mean Square Error) and visual quality of the reconstruction, but also has the advantages of rapid convergence and easy extension.
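For reference, the two objective metrics named in the abstract above, MSE and PSNR, can be computed as in the following Python sketch. It assumes NumPy and 8-bit images (peak value 255). The function names are illustrative and are not taken from the letter.

```python
import numpy as np

def mse(reference, reconstructed):
    """Mean square error between two images of the same shape."""
    ref = np.asarray(reference, dtype=np.float64)
    rec = np.asarray(reconstructed, dtype=np.float64)
    return float(np.mean((ref - rec) ** 2))

def psnr(reference, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(reference, reconstructed)
    if err == 0.0:
        return float("inf")
    return float(10.0 * np.log10(peak ** 2 / err))

# Usage: every pixel differs by one gray level, so the MSE is exactly 1
ref = np.zeros((4, 4), dtype=np.uint8)
rec = np.ones((4, 4), dtype=np.uint8)
quality_db = psnr(ref, rec)
```

Casting to floating point before subtracting avoids the wrap-around that unsigned 8-bit arithmetic would otherwise introduce.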
Abstract: Extraction of traffic information from images or video sequences is a hot research topic in intelligent transportation systems and computer vision. A real-time traffic information extraction method, based on compressed video with inter-frame motion vectors, is proposed for speed, density and flow detection under a fixed camera setting and a well-defined environment. The motion vectors are first separated from the compressed video streams and then filtered, using knowledge of the well-defined environment, to eliminate incorrect and noisy vectors. By applying the projective transform to the filtered motion vectors, speed can be calculated from motion vector statistics, density can be estimated from the motion vector occupancy, and flow can be detected from the combination of speed and density. A prototype system for sky camera traffic monitoring using MPEG video has been implemented, and experimental results demonstrated the effectiveness of the proposed method.
Funding: Supported by the National Natural Science Foundation of China (No. 60772069) and the 863 High-Tech Project (2008AA01A313).
Abstract: Pornographic image/video recognition plays a vital role in network information surveillance and management. In this paper, its key techniques, such as skin detection, key frame extraction, and classifier design, are studied in the compressed domain. First, a skin detection method based on data mining in the compressed domain is proposed, which achieves higher detection accuracy and higher speed. Then, a cascade scheme for pornographic image recognition based on a selective decision tree ensemble is proposed to improve both the speed and the accuracy of recognition. Finally, a key frame extraction solution for pornographic video in the compressed domain and an approach to pornographic video recognition are discussed.
Abstract: The two mast cameras (Mastcams) onboard the Mars rover Curiosity are multispectral imagers with nine bands each. Currently, the images are compressed losslessly using JPEG, which achieves a compression ratio of only two to three. We present a comparative study of four approaches to compressing multispectral Mastcam images. The first approach divides the nine bands into three groups of three bands each. Since the multispectral bands are strongly correlated, we treat the three groups of images as video frames. We call this the Video approach. The second approach compresses each group separately, and we call it the split band (SB) approach. The third is a two-step approach in which the first step uses principal component analysis (PCA) to compress a nine-band image cube to six bands, and the second step compresses the six PCA bands using conventional codecs. The fourth applies PCA only. In addition, we present subjective and objective assessment results for compressing RGB images, because RGB images have been used for stereo and disparity map generation. Five well-known compression codecs from the literature, namely JPEG, JPEG-2000 (J2K), X264, X265, and Daala, have been applied and compared in each approach. The performance of the different algorithms was assessed using four well-known performance metrics: two are conventional, and the other two are known to correlate well with human perception. Extensive experiments using actual Mastcam images have been performed to demonstrate the various approaches. We observed that perceptually lossless compression can be achieved at a 10:1 compression ratio. In particular, the performance gain of the SB approach with Daala over JPEG is at least 5 dB in terms of peak signal-to-noise ratio (PSNR) at a 10:1 compression ratio. Subjective comparisons also corroborated the objective metrics, in that perceptually lossless compression can be achieved even at 20:1 compression.
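The PCA step of the third approach above (reducing a nine-band cube to six bands) might look like the following Python sketch. It assumes NumPy and a cube stored as height × width × bands. The function names `pca_compress` and `pca_reconstruct` are hypothetical, and the subsequent coding of the six PCA bands with a conventional codec is not shown.

```python
import numpy as np

def pca_compress(cube, n_components=6):
    """Project each pixel's spectrum onto the leading principal components."""
    h, w, bands = cube.shape
    pixels = cube.reshape(-1, bands).astype(np.float64)
    mean = pixels.mean(axis=0)
    centered = pixels - mean
    # Rows of vt are principal directions, ordered by decreasing variance
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components]
    scores = centered @ basis.T
    return scores.reshape(h, w, n_components), basis, mean

def pca_reconstruct(scores, basis, mean):
    """Map PCA band images back to the original spectral bands."""
    h, w, k = scores.shape
    pixels = scores.reshape(-1, k) @ basis + mean
    return pixels.reshape(h, w, basis.shape[1])

# Usage on a synthetic nine-band cube (random data, for illustration only)
rng = np.random.default_rng(0)
cube = rng.random((16, 16, 9))
scores, basis, mean = pca_compress(cube, n_components=6)
approx = pca_reconstruct(scores, basis, mean)
```

Random data has little inter-band correlation, so the reconstruction error here is larger than it would be for real multispectral bands, which the abstract describes as strongly correlated.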