Although the express delivery industry is large, fully intelligent receiving and sending of parcels remains difficult to achieve. This paper proposes an intelligent express delivery system based on OpenCV image and video processing, the Faster R-CNN object detection algorithm, and other technologies. Using a depth camera and an electronic scale, the system identifies the category, volume, and weight of the items that the sender places on the scale, and stores video of the items being packed into the cabinet. The overall framework of the system was constructed, key technologies were applied to realize it, and its functions were tested. The experimental results show that the integrated system, which combines intelligent identification with information traceability, achieves intelligent automation of express delivery and promotes the development of the express delivery industry.
Video steganography plays an important role in secret communication: it conceals a secret video in a cover video by perturbing pixel values in the cover frames. Imperceptibility is the first and foremost requirement of any steganographic approach. Inspired by the fact that human eyes perceive pixel perturbation differently in different video areas, a novel Deeply-Recursive Attention Network (DRANet) for video steganography is proposed, which finds suitable areas for information hiding by modelling spatio-temporal attention. DRANet contains two main components, a Non-Local Self-Attention (NLSA) block and a Non-Local Co-Attention (NLCA) block. The NLSA block selects the cover-frame areas that are suitable for hiding by computing correlations between and within cover frames. The NLCA block produces enhanced representations of the secret frames to improve the robustness of the model and to alleviate the influence of different areas in the secret video. Furthermore, DRANet reduces the number of model parameters by recursively performing similar operations on the different frames of an input video. Experimental results show that the proposed DRANet achieves better performance with fewer parameters than state-of-the-art competitors.
Determining stream velocity matters in many sectors of industry, and increasingly so as more precise measurements are needed to determine production rates and the volumetric production of undesired fluid, to establish automated controls that avoid over-flooding or over-production, to support accurate predictive maintenance, and so on. One difficulty is determining the velocity of a specific fluid embedded in another, for example the stream velocity of gas bubbles flowing through a liquid phase. Although various methods have been researched and are already used in industry, a non-intrusive, automated way of providing these stream velocities is valuable and may have a large impact on project budgets. The script developed here breaks real-time video into frame images and analyzes pixel correlations to find possible superposition matches, from which the gas bubble stream velocity is estimated. The script relies on functions and procedures already available in MATLAB for image processing. In the test run its accuracy was around 97%; the raw source code, with comments, had almost 3000 characters; and the hardware used was an Intel Core Duo 2.13 GHz workstation with 2 GB of RAM. Although the results were good, only the end-point correlations actually contributed to the final solution. Self-learning functions or a neural network could therefore improve the application's ability to run in real time without being exhausted by iterative loops.
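The frame-correlation idea in this abstract can be illustrated with a minimal sketch. It is not the authors' MATLAB script: it assumes each frame has already been reduced to a one-dimensional intensity profile along the flow direction, and the names `best_shift`, `velocity`, `metres_per_pixel`, and `fps` are hypothetical. The shift that maximizes the mean product of the two profiles is taken as the bubble displacement between frames.

```python
def best_shift(profile_a, profile_b, max_shift):
    """Return the shift s (in pixels) that maximizes the mean product of
    profile_a[i] and profile_b[i + s] over the overlapping samples.
    This is an unnormalized correlation score, used here for simplicity."""
    n = len(profile_a)
    best_s, best_score = 0, float("-inf")
    for s in range(max_shift + 1):
        overlap = n - s
        score = sum(profile_a[i] * profile_b[i + s] for i in range(overlap)) / overlap
        if score > best_score:
            best_s, best_score = s, score
    return best_s


def velocity(shift_px, metres_per_pixel, fps):
    """Convert a per-frame pixel displacement into metres per second."""
    return shift_px * metres_per_pixel * fps


# Hypothetical intensity profiles: a bright bubble moves 3 pixels downstream.
frame_a = [0, 0, 9, 9, 0, 0, 0, 0, 0, 0]
frame_b = [0, 0, 0, 0, 0, 9, 9, 0, 0, 0]
shift = best_shift(frame_a, frame_b, max_shift=5)
speed = velocity(shift, metres_per_pixel=0.001, fps=30)
```

The pixel size and frame rate are assumed calibration values; a real setup would measure them for the camera and pipe in question.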
Readily available tampering software has made digital media easy to manipulate, putting the authenticity of records, especially video, at high risk. There is a pressing need to detect such tampering and act on it. In this work, we propose an approach to detect inter-frame video forgery using deep features obtained from a parallel deep neural network model together with thorough analytical computation. The approach uses only the deep features extracted from the CNN model and then applies conventional mathematical analysis to these features to find the forgery in the video. The correlation coefficient is calculated from the deep features of adjacent frames rather than directly from the frames. We divide forgery detection into two phases: video forgery detection and video forgery classification. In the detection phase, the approach decides whether the input video is original or tampered. If the video is not original, it is passed to the classification phase, in which the method checks the forged video for insertion forgery and deletion forgery and checks again for originality. The proposed work is generalized and is tested on two different datasets. The experimental results show that our approach detects forgery with an accuracy of 91% on the VIFFD dataset and 90% on the TDTV dataset, and classifies the type of forgery (insertion or deletion) with an accuracy of 82% on the VIFFD dataset and 86% on the TDTV dataset. This work can help in the analysis of original and tampered video in various domains.
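The adjacent-frame correlation step can be sketched as follows. This is an illustration under assumptions, not the paper's pipeline: the feature vectors are made-up numbers standing in for CNN features, and the threshold of 0.8 and the function names are hypothetical. A sharp drop in the Pearson correlation between consecutive feature vectors marks a candidate insertion or deletion point.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors.
    Returns 0.0 when either vector has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def suspicious_boundaries(features, threshold):
    """Indices i where the correlation between frame i and frame i + 1
    falls below the threshold."""
    return [i for i in range(len(features) - 1)
            if pearson(features[i], features[i + 1]) < threshold]


# Hypothetical per-frame deep features; frames 2 and 3 come from another scene.
features = [
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.0, 3.2, 3.9],
    [4.0, 1.0, 3.5, 0.5],
    [4.1, 0.9, 3.4, 0.6],
]
cuts = suspicious_boundaries(features, threshold=0.8)
```

A fixed threshold also fires on genuine scene cuts, so it is only a starting point; the abstract does not state how the authors separate the two cases.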
During the past decade, feature extraction and knowledge acquisition based on video analysis have been extensively researched and tested in many applications, such as closed-circuit television (CCTV) data analysis, large-scale public event control, and other daily security monitoring and surveillance operations, with varying degrees of success. However, video processing is a multi-phase task that draws on theories and techniques ranging from fundamental image processing, computational geometry and graphics, and machine vision to advanced artificial intelligence, pattern analysis, and even cognitive science, so many important problems remain to be resolved before it can be widely applied. Video event identification and detection are two prominent ones. In contrast to the frame-to-frame processing mode used by most of today's approaches and systems, this project reorganizes video data as a 3D volume structure that provides hybrid spatial and temporal information in a unified space. This paper reports an innovative technique for transforming original video frames into 3D volume structures described by spatial and temporal features. It then highlights the volume array structure in a so-called "pre-suspicion" mechanism for later processing. The focus of this report is the development of an effective and efficient voxel-based segmentation technique that suits the volumetric nature of video events and is ready for deployment in 3D clustering operations. The paper concludes with a performance evaluation of the devised technique and a discussion of future work on accelerating the pre-processing of the original video data.
A novel temporal shape error concealment technique is proposed, which can be used in the context of object-based video coding schemes. To reduce the effect of shape variations of a video object, the curvature scale space (CSS) technique is adopted to extract features, and these features are then used for boundary matching between the current frame and the previous frame. Because temporal, spatial, and statistical video contour information is all considered, the proposed method can find the optimal match, which is used to replace the damaged contours. The simulation results show that the proposed algorithm achieves better subjective and objective quality and higher efficiency than previously developed methods.
Multi-armored target tracking (MATT) plays a crucial role in coordinated tracking and strike. Occlusion and insertion among targets and target scale variation are the key problems in MATT. Most state-of-the-art multi-object tracking (MOT) works adopt the tracking-by-detection strategy, which relies on a compute-intensive sliding-window or anchoring scheme in the detection module and neglects target scale variation in the tracking module. In this work, we propose a more efficient and effective spatial-temporal attention scheme to track multiple armored targets on the ground battlefield. By simulating the structure of the retina, a novel visual-attention Gabor filter branch is proposed to enhance detection. By introducing temporal information, online-learned target-specific Convolutional Neural Networks (CNNs) are adopted to address occlusion. More importantly, we built an MOT dataset for armored targets, called the Armored Target Tracking Dataset (ATTD), on which several comparative experiments with state-of-the-art methods are conducted. Experimental results show that the proposed method achieves outstanding tracking performance and meets practical application requirements.
The COVID-19 pandemic has adversely affected the entire world and has created high demand for techniques that manage crowd-related tasks remotely. Video surveillance and crowd management using video analysis techniques have significantly influenced current research, and numerous applications have been developed in this domain. This research proposes an anomaly detection technique, based on sparse crowd analysis, that is applied to Umrah videos recorded at the Kaaba during the COVID-19 pandemic. Managing the Kaaba rituals is crucial because the crowd gathers from around the world and requires proper analysis during the pandemic. The Umrah videos are analyzed, and a system is devised that can track and monitor the crowd flow at the Kaaba. The crowd in these videos is sparse because of the pandemic, and we have developed a technique to track the dominant crowd flow and to detect any object (person) moving against the direction of the major flow. We detect abnormal movement by creating histograms of the vertical and horizontal flows and applying thresholds to identify the non-majority flow. Our algorithm aims to analyze the crowd through video surveillance and to detect abnormal activity promptly, so as to maintain a smooth crowd flow at the Kaaba during the pandemic.
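The histogram-and-threshold step can be sketched in a few lines. This is a simplified illustration, not the authors' algorithm: it assumes each tracked person has already been reduced to a displacement vector `(dx, dy)`, bins each component only by sign, and uses a hypothetical `min_share` threshold. A vector is flagged when its horizontal or vertical bin holds less than that share of the crowd.

```python
from collections import Counter


def sign_bin(v, dead_zone=0.5):
    """Quantize one flow component into '+', '-' or '0' (nearly static)."""
    if v > dead_zone:
        return "+"
    if v < -dead_zone:
        return "-"
    return "0"


def non_majority(flows, min_share=0.2):
    """Return the indices of flow vectors whose horizontal or vertical
    direction bin contains less than min_share of all vectors."""
    n = len(flows)
    h_hist = Counter(sign_bin(dx) for dx, _ in flows)
    v_hist = Counter(sign_bin(dy) for _, dy in flows)
    flagged = []
    for i, (dx, dy) in enumerate(flows):
        if h_hist[sign_bin(dx)] / n < min_share or v_hist[sign_bin(dy)] / n < min_share:
            flagged.append(i)
    return flagged


# Hypothetical displacements: five people move right, one moves left.
flows = [(2.0, 0.1), (1.8, -0.2), (2.2, 0.0), (1.9, 0.3), (2.1, 0.1), (-2.0, 0.0)]
outliers = non_majority(flows)
```

Movement around the Kaaba is circular, so a practical system would likely build these histograms per image region rather than over the whole frame; the abstract does not say which the authors do.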
The Rate Distortion Optimization (RDO) algorithm in High Efficiency Video Coding (HEVC) involves many iterations and a large number of calculations. To reduce the calculation time and to support fast switching between RDO algorithms of different scales, a dynamically reconfigurable RDO structure is proposed. First, the Quantization Parameter (QP) and bit rate values were loaded through an H-tree Configurable Network (HCN), and the execution status of the array was monitored in real time. When a request to switch the RDO algorithm was detected, the corresponding configuration information was delivered. This self-reconfiguration method improved the flexibility and utilization of the hardware. Experimental results show that with only a 31.25% increase in control bit width, the designed configuration network increased the number of controllable processing units by 32 times, and the execution cycle was 50% lower than that of designs of the same type. Compared with previous RDO implementations, the RDO algorithm implemented on the reconfigurable array based on the configuration network had an average operating frequency increase of 12.5% and an area reduction of 56.4%.
Developments in neurophysiology focusing on foveal vision have characterized more and more precisely the spatiotemporal processing that is well adapted to the regularization of the visual information within the retina. The works described in this article focus on a simplified architectural model based on features and mechanisms of adaptation in the retina. Similarly to the biological retina, which transforms luminance information into a series of encoded representations of image characteristics transmitted to the brain, our structural model allows us to reveal more information in the scene. Our modeling of the different functional pathways permits the mapping of important complementary information types at abstract levels of image analysis, and thereby allows a better exploitation of visual clues. Our model is based on a distributed cellular automata network and simulates the retinal processing of stimuli that are stationary or in motion. Thanks to its capacity for dynamic adaptation, our model can adapt itself to different scenes (e.g., bright and dim, stationary and moving, etc.) and can parallelize those processing steps that can be supported by parallel calculators.
We present a system that automatically recovers scene geometry and illumination from a video, providing a basis for various applications. Previous image based illumination estimation methods require either user interaction or external information in the form of a database. We adopt structure-from-motion and multi-view stereo for initial scene reconstruction, and then estimate an environment map represented by spherical harmonics (as these perform better than other bases). We also demonstrate several video editing applications that exploit the recovered geometry and illumination, including object insertion (e.g., for augmented reality), shadow detection, and video relighting.
This paper proposes a new algorithm based on low-rank matrix recovery to remove salt-and-pepper noise from surveillance video. Unlike single-image denoising techniques, noise removal from video sequences aims to use both temporal and spatial information. By grouping neighboring frames according to whole-image similarity in the temporal domain, we formulate the removal of salt-and-pepper noise from a video tracking sequence as a low-rank matrix recovery problem. The resulting nuclear-norm and L1-norm minimization problems can be solved efficiently by many recently developed methods. To determine the low-rank matrix, we use an averaging method based on other similar images. Our method not only removes noise but also preserves edges and details. The proposed approach compares favorably with existing algorithms and gives better PSNR and SSIM results.
The quality of the side information has an immense effect on the compression efficiency of a distributed video coding (DVC) system. This article proposes a new side information generation algorithm, based on hierarchical motion estimation (HME), which is integrated into the DVC system. First, forward motion estimation (FME) and bidirectional motion estimation (BME), built on a variable-block-size HME algorithm, are used to acquire relatively accurate motion vectors. Second, a motion vector filter (MVF) is i...
This paper introduces a new method of converting interlaced video to progressively scanned video and images. The method is derived from a Bayesian framework with a spatial-temporal smoothness constraint, and the MAP estimate is obtained by minimizing an energy functional. The half-quadratic regularization method is used to solve the corresponding partial differential equations (PDEs). This approach gives improved results over conventional de-interlacing methods. Two criteria are also proposed that can be used to evaluate the performance of de-interlacing algorithms.
A real-time image-tracking algorithm is proposed, which gives small weights to pixels farther from the object center and uses the quantized image gray levels as a template. It locates the target by mean-shift iteration and estimates the target's scale by image feature recognition. It improves on the kernel-based algorithm in tracking targets whose scale changes. A decimation method is proposed for tracking large targets, and real-time experimental results verify the effectiveness of the proposed algorithm.
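The mean-shift location step can be sketched as repeated centroid updates on a weight map. This is a minimal sketch under assumptions, not the paper's algorithm: `weights` is a hypothetical per-pixel map that scores how well each pixel's quantized gray level matches the template, the search window is a flat square, and the scale estimation and decimation steps are omitted. In the standard kernel-based tracking formulation, an Epanechnikov kernel on the target model leads to exactly this kind of flat-window update.

```python
def mean_shift(weights, center, radius, max_iter=20):
    """Move a square window to the weighted centroid of the pixels it
    covers, repeating until the center stops changing.
    weights: 2D list of non-negative per-pixel scores.
    center:  (row, col) starting position.
    radius:  half-size of the square window, in pixels."""
    rows, cols = len(weights), len(weights[0])
    r, c = center
    for _ in range(max_iter):
        total = sum_r = sum_c = 0.0
        for i in range(max(0, r - radius), min(rows, r + radius + 1)):
            for j in range(max(0, c - radius), min(cols, c + radius + 1)):
                w = weights[i][j]
                total += w
                sum_r += w * i
                sum_c += w * j
        if total == 0:
            break  # no target evidence inside the window
        new_r, new_c = int(sum_r / total + 0.5), int(sum_c / total + 0.5)
        if (new_r, new_c) == (r, c):
            break  # converged
        r, c = new_r, new_c
    return r, c


# Hypothetical 8x8 weight map with a 3x3 target blob centered at (4, 4).
weights = [[0.0] * 8 for _ in range(8)]
for i in range(3, 6):
    for j in range(3, 6):
        weights[i][j] = 1.0

found = mean_shift(weights, center=(2, 2), radius=2)
```

Starting from (2, 2), the window first sees only a corner of the blob and moves toward it, then settles on the blob center once the whole target is inside the window.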
In this study, a low-complexity frame-rate up-conversion method that uses compressed-domain information is proposed for the H.264 decoder. In the proposed scheme, the motion vectors (MVs) are estimated with a constant-acceleration motion model, MVs judged unreliable are corrected, and the interpolation method is chosen according to the macroblock (MB) coding type. Applied to the H.264 decoder, the proposed method provides high-quality interpolated frames and a clear reduction in block artifacts.
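A constant-acceleration motion model can be worked through with textbook kinematics. The abstract does not give the authors' formula, so the following is an assumption-labelled sketch: `d_prev` and `d_curr` are hypothetical names for a block's displacement over the previous and the current frame interval, and the interpolated frame is assumed to lie halfway through the current interval. With p(τ) = vτ + aτ²/2 measured from the start of the current interval, d_curr = v + a/2 and d_prev = v − a/2, so the displacement at τ = 1/2 is v/2 + a/8 = (d_prev + 3·d_curr)/8.

```python
def half_frame_mv(d_prev, d_curr):
    """Displacement of a block from the previous decoded frame to the
    temporal midpoint of the current frame interval, assuming constant
    acceleration. Each argument is an (x, y) displacement in pixels."""
    return tuple((p + 3 * c) / 8 for p, c in zip(d_prev, d_curr))


# Constant velocity: the midpoint displacement is half the motion vector.
steady = half_frame_mv((4.0, 0.0), (4.0, 0.0))

# Accelerating block: 2 px over the previous interval, 4 px over the current one.
accelerating = half_frame_mv((2.0, 0.0), (4.0, 0.0))
```

For the accelerating block the result is 1.75 pixels, less than the 2.0 that a constant-velocity model would assign, because the block is still slower during the first half of the interval.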
Based on an analysis of the properties of interferential multispectral images, a novel compression algorithm is presented that combines partial set partitioning in hierarchical trees (SPIHT) with classified weighted rate-distortion optimization. After wavelet decomposition, partial SPIHT is applied to each zero tree independently, adaptively selecting one of three coding modes according to the probability of significant coefficients in each bitplane. Meanwhile, the interferential multispectral image is partitioned into two kinds of regions by luminous intensity, and the rate-distortion slopes of the zero trees are then lifted with classified weights according to their distortion contribution to the constructed spectrum. Finally, a global rate-distortion optimization truncation is performed. Compared with conventional methods, the proposed algorithm not only improves performance in the spatial domain but also reduces distortion in the spectral domain.
Funding (intelligent express delivery system): Supported by the 2020 Innovation and Entrepreneurship Training Program for College Students in Jiangsu Province (project name: Traceable multi-functional intelligent express cabinet, No. 201911460090P, No. 202011460090T); the National Natural Science Foundation of China Youth Science Foundation project (project name: Research on Deep Discriminant Spares Representation Learning Method for Feature Extraction, No. 61806098); and the Scientific Research Project of Nanjing XiaoZhuang University (project name: Multi-robot collaborative system, No. 2017NXY16).
Funding (DRANet video steganography): Supported in part by NSFC (62002320, U19B2043, 61672456) and the Key R&D Program of Zhejiang Province, China (2021C01119).
Funding (gas bubble stream velocity): Financial support from the Brazilian Federal Agency for Support and Evaluation of Graduate Education (Coordenacao de Aperfeicoamento de Pessoal de Nivel Superior, CAPES; scholarship process no. BEX 0506/15-0); the Brazilian National Agency of Petroleum, Natural Gas and Biofuels (Agencia Nacional do Petroleo, Gas Natural e Biocombustiveis, ANP), in cooperation with the Brazilian Financier of Studies and Projects (Financiadora de Estudos e Projetos, FINEP); and the Brazilian Ministry of Science, Technology and Innovation (Ministério da Ciencia, Tecnologia e Inovacao, MCTI), through the ANP's Human Resources Program of the State University of Sao Paulo (Universidade Estadual Paulista, UNESP) for the Oil and Gas Sector, PRH-ANP/MCTI no. 48 (PRH48).
Funding (temporal shape error concealment): Supported by the National Natural Science Foundation of China (60532070).
Funding (multi-armored target tracking): Supported by the National Key Research and Development Program of China (No. 2016YFC0802904), the National Natural Science Foundation of China (No. 61671470), the Natural Science Foundation of Jiangsu Province (BK20161470), and the 62nd batch of funded projects of the China Postdoctoral Science Foundation (No. 2017M623423).
Funding: The authors extend their appreciation to the Deputyship for Research and Innovation, Ministry of Education in Saudi Arabia for funding this research work through the Project Number QURDO001. Project title: Intelligent Real-Time Crowd Monitoring System Using Unmanned Aerial Vehicle (UAV) Video and Global Positioning Systems (GPS) Data.
Abstract: The advent of the COVID-19 pandemic has adversely affected the entire world and has created high demand for techniques that remotely manage crowd-related tasks. Video surveillance and crowd management using video analysis techniques have significantly impacted today's research, and numerous applications have been developed in this domain. This research proposes an anomaly detection technique applied to Umrah videos at the Kaaba during the COVID-19 pandemic through sparse crowd analysis. Managing the Kaaba rituals is crucial since the crowd gathers from around the world and requires proper analysis during these days of the pandemic. The Umrah videos are analyzed, and a system is devised that can track and monitor the crowd flow at the Kaaba. The crowd in these videos is sparse due to the pandemic, and we have developed a technique to track the maximum crowd flow and detect any object (person) moving in a direction unlike that of the major flow. We detect abnormal movement by creating histograms for the vertical and horizontal flows and applying thresholds to identify the non-majority flow. Our algorithm aims to analyze the crowd through video surveillance and detect any abnormal activity in time to maintain a smooth crowd flow at the Kaaba during the pandemic.
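The histogram-and-threshold idea described above can be sketched as follows. This is only an illustration under stated assumptions: each tracked person is reduced to one displacement `(dx, dy)` in image coordinates (y pointing down), and the function name, `min_speed` and `minority_share` are hypothetical parameters, not values from the paper.

```python
from collections import Counter

def flag_abnormal_flows(flows, min_speed=1.0, minority_share=0.2):
    """Return indices of tracks moving in a direction that only a small share of tracks follow.

    flows: list of (dx, dy) displacements, one per tracked person.
    """
    horizontal, vertical = Counter(), Counter()
    labels = []
    for dx, dy in flows:
        # Components slower than min_speed are ignored as jitter.
        h = ("right" if dx > 0 else "left") if abs(dx) >= min_speed else None
        v = ("down" if dy > 0 else "up") if abs(dy) >= min_speed else None
        labels.append((h, v))
        if h:
            horizontal[h] += 1
        if v:
            vertical[v] += 1

    def rare_directions(hist):
        # Directions whose share of the histogram falls below the threshold.
        total = sum(hist.values())
        return {d for d, count in hist.items() if count / total < minority_share}

    rare_h, rare_v = rare_directions(horizontal), rare_directions(vertical)
    return [i for i, (h, v) in enumerate(labels) if h in rare_h or v in rare_v]
```

For example, nine tracks moving right and one moving left yields the index of the single left-moving track, while a crowd moving uniformly yields an empty list.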
Funding: Sponsored by the National Natural Science Foundation of China (Grant Nos. 61834005, 61772417, 61802304, 61602377, and 61634004), the Shaanxi Province Coordination Innovation Project of Science and Technology (Grant No. 2016KTZDGY02-04-02), the Shaanxi Provincial Key R&D Plan (Grant No. 2017GY-060), and the Shaanxi International Science and Technology Cooperation Program (Grant No. 2018KW-006).
Abstract: The Rate Distortion Optimization (RDO) algorithm in High Efficiency Video Coding (HEVC) has many iterations and a large number of calculations. In order to decrease the calculation time and meet the requirements of fast switching of RDO algorithms of different scales, an RDO dynamic reconfigurable structure is proposed. First, the Quantization Parameter (QP) and bit rate values were loaded through an H-tree Configurable Network (HCN), and the execution status of the array was detected in real time. When the switching request of the RDO algorithm was detected, the corresponding configuration information was delivered. This self-reconfiguration implementation method improved the flexibility and utilization of hardware. Experimental results show that when the control bit width was only increased by 31.25%, the designed configuration network could increase the number of controllable processing units by 32 times, and the execution cycle was 50% lower than the same type of design. Compared with the previous RDO algorithm, the RDO algorithm implemented on the reconfigurable array based on the configuration network had an average operating frequency increase of 12.5% and an area reduction of 56.4%.
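To make the cost of RDO concrete, here is a minimal sketch of the generic Lagrangian decision that an encoder repeats for every candidate, J = D + λR. It does not model the reconfigurable hardware described above; the candidate tuples, the function name and the λ values are illustrative assumptions.

```python
def best_mode(candidates, lam):
    """Pick the candidate with the smallest rate-distortion cost J = D + lam * R.

    candidates: list of (name, distortion, rate_bits) tuples.
    lam: Lagrange multiplier; in a real encoder it is derived from the QP.
    """
    return min(candidates, key=lambda c: c[1] + lam * c[2])

# Hypothetical candidates for one block: (mode name, distortion, bits).
modes = [
    ("intra_dc", 1200.0, 20),
    ("intra_angular", 900.0, 45),
    ("skip", 2000.0, 2),
]
low_lambda_choice = best_mode(modes, lam=10.0)   # distortion dominates the cost
high_lambda_choice = best_mode(modes, lam=50.0)  # rate dominates the cost
```

Because every mode, partition and transform option needs its own distortion and rate estimate before this comparison can run, the number of evaluations grows quickly, which is the workload the abstract refers to.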
Abstract: Developments in neurophysiology focusing on foveal vision have characterized more and more precisely the spatiotemporal processing that is well adapted to the regularization of the visual information within the retina. The works described in this article focus on a simplified architectural model based on features and mechanisms of adaptation in the retina. Similarly to the biological retina, which transforms luminance information into a series of encoded representations of image characteristics transmitted to the brain, our structural model allows us to reveal more information in the scene. Our modeling of the different functional pathways permits the mapping of important complementary information types at abstract levels of image analysis, and thereby allows a better exploitation of visual clues. Our model is based on a distributed cellular automata network and simulates the retinal processing of stimuli that are stationary or in motion. Thanks to its capacity for dynamic adaptation, our model can adapt itself to different scenes (e.g., bright and dim, stationary and moving, etc.) and can parallelize those processing steps that can be supported by parallel calculators.
Funding: This work was supported by the National Natural Science Foundation of China (NSFC) and the Israel Science Foundation (ISF), Joint NSFC-ISF Research Program under Grant No. 61561146393, the National Natural Science Foundation of China under Grant No. 61521002, a research grant from the Beijing Higher Institution Engineering Research Center, and the Tsinghua-Tencent Joint Laboratory for Internet Innovation Technology.
Abstract: We present a system that automatically recovers scene geometry and illumination from a video, providing a basis for various applications. Previous image-based illumination estimation methods require either user interaction or external information in the form of a database. We adopt structure-from-motion and multi-view stereo for initial scene reconstruction, and then estimate an environment map represented by spherical harmonics (as these perform better than other bases). We also demonstrate several video editing applications that exploit the recovered geometry and illumination, including object insertion (e.g., for augmented reality), shadow detection, and video relighting.
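As an illustration of representing an environment map with spherical harmonics, the sketch below fits the first nine real SH coefficients (bands 0 to 2) to radiance samples by ordinary least squares. It assumes NumPy is available and that radiance samples with known directions already exist; the function names are hypothetical, and the paper's actual estimation procedure is not described in the abstract.

```python
import numpy as np

def sh_basis(directions):
    """Evaluate the 9 real spherical harmonics of bands 0-2 for each direction (N x 3)."""
    d = np.asarray(directions, dtype=float)
    d = d / np.linalg.norm(d, axis=1, keepdims=True)   # normalise to unit vectors
    x, y, z = d[:, 0], d[:, 1], d[:, 2]
    return np.stack([
        0.282095 * np.ones_like(x),       # band 0
        0.488603 * y,                     # band 1
        0.488603 * z,
        0.488603 * x,
        1.092548 * x * y,                 # band 2
        1.092548 * y * z,
        0.315392 * (3.0 * z * z - 1.0),
        1.092548 * x * z,
        0.546274 * (x * x - y * y),
    ], axis=1)

def fit_sh(directions, radiance):
    """Least-squares SH coefficients for radiance samples (one colour channel)."""
    basis = sh_basis(directions)
    coeffs, *_ = np.linalg.lstsq(basis, np.asarray(radiance, dtype=float), rcond=None)
    return coeffs
```

The recovered coefficients can be evaluated in any direction with `sh_basis(new_dirs) @ coeffs`; for colour images the fit would be repeated per channel.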
Funding: Supported by the National Natural Science Foundation of China (Nos. 61332015, 61373078, 61272245, and 61272430) and the NSFC Joint Fund with Guangdong (No. U1201258).
Abstract: This paper proposes a new algorithm based on low-rank matrix recovery to remove salt-and-pepper noise from surveillance video. Unlike single-image denoising techniques, noise removal from video sequences aims to utilize both temporal and spatial information. By grouping neighboring frames based on similarities of the whole images in the temporal domain, we formulate the problem of removing salt-and-pepper noise from a video tracking sequence as a low-rank matrix recovery problem. The resulting nuclear norm and L1-norm related minimization problems can be efficiently solved by many recently developed methods. To determine the low-rank matrix, we use an averaging method based on other similar images. Our method can not only remove noise but also preserve edges and details. The performance of our proposed approach compares favorably to that of existing algorithms and gives better PSNR and SSIM results.
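The low-rank formulation can be illustrated by stacking vectorised frames as the columns of a matrix M and splitting it into a low-rank part L (the scene) and a sparse part S (the impulse noise). The sketch below uses the standard principal component pursuit iteration solved by ADMM, which is one of the generic nuclear-norm plus L1 solvers the abstract alludes to; it is not the paper's averaging-based method. It assumes NumPy, and the default weights follow the common choices λ = 1/√max(m, n) and μ = mn / (4‖M‖₁).

```python
import numpy as np

def shrink(X, tau):
    """Soft-thresholding: the proximal operator of the L1 norm."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svt(X, tau):
    """Singular value thresholding: the proximal operator of the nuclear norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * shrink(s, tau)) @ Vt

def rpca(M, max_iter=1000, tol=1e-7):
    """Split M into low-rank L and sparse S by minimising ||L||_* + lam * ||S||_1 with L + S = M."""
    M = np.asarray(M, dtype=float)
    m, n = M.shape
    lam = 1.0 / np.sqrt(max(m, n))
    mu = m * n / (4.0 * np.abs(M).sum())
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    Y = np.zeros_like(M)              # Lagrange multiplier for the constraint L + S = M
    norm_M = np.linalg.norm(M)
    for _ in range(max_iter):
        L = svt(M - S + Y / mu, 1.0 / mu)
        S = shrink(M - L + Y / mu, lam / mu)
        residual = M - L - S
        Y = Y + mu * residual
        if np.linalg.norm(residual) <= tol * norm_M:
            break
    return L, S
```

Each column of the returned L is a denoised frame that can be reshaped back to the image size. Grouping only similar neighbouring frames, as the abstract describes, keeps the rank of the clean part low.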
Funding: National Natural Science Foundation of China (60702012)
Abstract: The side information quality has an immense effect on the compression efficiency of the distributed video coding (DVC) system. This article, based on the hierarchical motion estimation (HME), proposes a new side information generation algorithm which is integrated into the DVC system. First, forward motion estimation (FME) and bidirectional motion estimation (BME) on the basis of the variable block size HME algorithm are used to acquire relatively accurate motion vectors. Second, a motion vector filter (MVF) is i...
Abstract: This paper introduces a new method of converting interlaced video to progressively scanned video and images. The new method is derived from a Bayesian framework with a spatial-temporal smoothness constraint, and the MAP estimate is obtained by minimizing the energy functional. The half-quadratic regularization method is used to solve the corresponding partial differential equations (PDEs). This approach gives improved results over conventional de-interlacing methods. Two criteria are proposed in the paper, which can be used to evaluate the performance of de-interlacing algorithms.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 60572023).
Abstract: A real-time image-tracking algorithm is proposed, which gives small weights to pixels farther from the object center and uses the quantized image gray scales as a template. It identifies the target's location by the mean-shift iteration method and arrives at the target's scale by using image feature recognition. It improves the kernel-based algorithm in tracking scale-changing targets. A decimation method is proposed to track large-sized targets, and real-time experimental results verify the effectiveness of the proposed algorithm.
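The template described above, quantized gray levels weighted so that pixels far from the object center count less, can be sketched as a kernel-weighted histogram. The Epanechnikov-style weight profile, the bin count and the function names below are assumptions for illustration; the mean-shift iteration and the scale estimation are not shown.

```python
import math

def kernel_histogram(patch, n_bins=16, max_gray=255):
    """Kernel-weighted histogram of quantized gray levels for a patch centred on the target.

    patch: 2D list of gray values (rows of equal length).
    """
    height, width = len(patch), len(patch[0])
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    hist = [0.0] * n_bins
    for r, row in enumerate(patch):
        for c, gray in enumerate(row):
            # Squared distance from the centre, normalised by the patch half-size.
            ry, rx = (r - cy) / (height / 2.0), (c - cx) / (width / 2.0)
            weight = max(0.0, 1.0 - (ry * ry + rx * rx))   # smaller weight farther out
            bin_index = min(n_bins - 1, int(gray) * n_bins // (max_gray + 1))
            hist[bin_index] += weight
    total = sum(hist)
    return [v / total for v in hist] if total > 0 else hist

def bhattacharyya(p, q):
    """Similarity between two normalised histograms (1.0 means identical)."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))
```

A tracker of this kind compares the template histogram with the histogram of a candidate window using a similarity such as `bhattacharyya` and shifts the window toward higher similarity.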
Abstract: In this study, a low-complexity frame-rate up-conversion method using compressed-domain information for the H.264 decoder is proposed. In the proposed scheme, the motion vectors (MVs) are estimated using a constant acceleration motion model, MVs judged unreliable are corrected, and the interpolation method is applied on the basis of the macroblock (MB) coded types. Applied to the H.264 decoder, the proposed method provides high-quality interpolated frames and an obvious decrease in block artifacts.
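A constant acceleration motion model can be worked through as follows. Suppose a block moves by d1 between frames n-2 and n-1 and by d2 between frames n-1 and n. With constant acceleration a = d2 - d1, the displacement from frame n-1 to the intermediate time n-1+t is t·d1 + a·t(1+t)/2, which reduces to d2 at t = 1 and to half the motion vector when the motion is uniform. Which frames and vectors the paper actually uses is not stated in the abstract, so the sketch below is an assumption-based illustration with a hypothetical function name.

```python
def interpolated_displacement(d_prev, d_cur, t=0.5):
    """Displacement from frame n-1 to time n-1+t under constant acceleration.

    d_prev: (dx, dy) motion of the block from frame n-2 to frame n-1.
    d_cur:  (dx, dy) motion of the block from frame n-1 to frame n.
    t:      position of the interpolated frame, 0 < t <= 1 (0.5 doubles the frame rate).
    """
    ax, ay = d_cur[0] - d_prev[0], d_cur[1] - d_prev[1]   # acceleration per frame^2
    k = t * (1.0 + t) / 2.0
    return (t * d_prev[0] + k * ax, t * d_prev[1] + k * ay)
```

For accelerating motion such as d1 = (2, 0) and d2 = (6, 0), the midpoint displacement is (2.5, 0) rather than the (3, 0) that simply halving d2 would give.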
Funding: The National Natural Science Foundation of China (Nos. 60532060 and 60507012) and the Innovation Foundation of Xidian University (Chuang 05025).
Abstract: Based on the property analysis of interferential multispectral images, a novel compression algorithm of partial set partitioning in hierarchical trees (SPIHT) with classified weighted rate-distortion optimization is presented. After wavelet decomposition, partial SPIHT is applied to each zero tree independently by adaptively selecting one of three coding modes according to the probability of the significant coefficients in each bitplane. Meanwhile, the interferential multispectral image is partitioned into two kinds of regions in terms of luminous intensity, and the rate-distortion slopes of zero trees are then lifted with classified weights according to their distortion contribution to the constructed spectrum. Finally, a global rate-distortion optimization truncation is performed. Compared with the conventional methods, the proposed algorithm not only improves the performance in the spatial domain but also reduces the distortion in the spectral domain.