Key frame extraction based on sparse coding can reduce the redundancy of continuous frames and concisely express the entire video. However, developing a key frame extraction algorithm that can automatically extract a few frames with a low reconstruction error remains a challenge. In this paper, we propose a novel model of structured sparse-coding-based key frame extraction, wherein a nonconvex group log-regularizer is used with strong sparsity and a low reconstruction error. To automatically extract key frames, a decomposition scheme is designed to separate the sparse coefficient matrix by rows. The rows enforced by the nonconvex group log-regularizer become zero or nonzero, leading to the learning of the structured sparse coefficient matrix. To solve the nonconvex problems due to the log-regularizer, the difference of convex algorithm (DCA) is employed to decompose the log-regularizer into the difference of two convex functions related to the l1 norm, which can be directly obtained through the proximal operator. Therefore, an efficient structured sparse coding algorithm with the group log-regularizer for key frame extraction is developed, which can automatically extract a few frames directly from the video to represent the entire video with a low reconstruction error. Experimental results demonstrate that the proposed algorithm can extract more accurate key frames from most SumMe videos compared to the state-of-the-art methods. Furthermore, the proposed algorithm can obtain a higher compression with a nearly 18% increase compared to sparse modeling representation selection (SMRS) and an 8% increase compared to SC-det on the VSUMM dataset.
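The row-wise structure described in this abstract can be illustrated with the proximal operator of the convex group norm (the sum of the Euclidean norms of the rows), which shrinks each row as a unit and sets weak rows exactly to zero. The sketch below is not the paper's algorithm: it shows only this single proximal step, and the names `group_soft_threshold`, `selected_frames`, and the threshold `tau` are hypothetical choices made for illustration.

```python
import math


def group_soft_threshold(rows, tau):
    """Proximal operator of tau * sum_i ||row_i||_2, applied row by row."""
    result = []
    for row in rows:
        norm = math.sqrt(sum(v * v for v in row))
        # Rows whose norm is at most tau collapse to zero; the rest shrink.
        scale = max(0.0, 1.0 - tau / norm) if norm > 0 else 0.0
        result.append([scale * v for v in row])
    return result


def selected_frames(rows):
    """Indices of rows that stayed nonzero (candidate key frames)."""
    return [i for i, row in enumerate(rows) if any(v != 0.0 for v in row)]


# Assumed toy coefficient matrix: one row per candidate frame.
coefficients = [[3.0, 4.0], [0.1, 0.2], [0.0, 0.0], [1.0, 0.0]]
shrunk = group_soft_threshold(coefficients, tau=0.5)
print(selected_frames(shrunk))  # rows 0 and 3 survive
```

Because whole rows become zero or stay nonzero, the surviving row indices can be read directly as the selected frames, which is the property the abstract relies on.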
With the explosive growth of surveillance video data, browsing videos quickly and effectively has become an urgent problem. Video key frame extraction has received widespread attention as an effective solution. However, accurately capturing the local motion state changes of moving objects in the video is still challenging in key frame extraction. The target center offset can reflect the change of its motion state. Based on this observation, this paper proposes a novel key frame extraction method based on the center offset of moving objects. The proposed method utilizes the center offset to obtain the global and local motion state information of moving objects and selects the video frames where the center offset curve changes suddenly as the key frames. Such processing effectively overcomes the inaccuracy of traditional key frame extraction methods. First, the center point of each frame is extracted. Next, the center point offset of each frame is calculated, and the center offset curve is formed by connecting the center offsets of all frames. Finally, candidate key frames are extracted and optimized to generate the final key frames. The experimental results demonstrate that the proposed method outperforms the compared methods in capturing the local motion state changes of moving objects.
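The steps in this abstract (center points, per-frame offsets, sudden changes in the offset curve) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: the centers are assumed to be already available as (x, y) pairs, "sudden change" is approximated by a fixed hypothetical `threshold` on the difference between consecutive offsets, and the candidate optimization step is omitted.

```python
import math


def center_offsets(centers):
    """Distance moved by the object center between consecutive frames."""
    return [
        math.hypot(b[0] - a[0], b[1] - a[1])
        for a, b in zip(centers, centers[1:])
    ]


def sudden_change_frames(centers, threshold):
    """Frames where the offset before and after the frame differ sharply."""
    offsets = center_offsets(centers)
    keys = []
    for i in range(1, len(offsets)):
        # offsets[i - 1] is the move into frame i, offsets[i] the move out of it.
        if abs(offsets[i] - offsets[i - 1]) > threshold:
            keys.append(i)
    return keys


# Assumed toy track: steady motion, a stop at frame 3, then a jump after frame 5.
track = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 0), (3, 0), (8, 0)]
print(sudden_change_frames(track, threshold=0.5))  # → [3, 5]
```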
With the vigorous development of national infrastructure construction and public information construction, video surveillance systems have gradually penetrated various fields. Current key frame extraction technology captures target details inadequately and judges local actions inaccurately. To address this problem, a key frame extraction method based on the fractional Fourier transform is proposed. The method obtains phase spectrum information of different orders by performing the fractional Fourier transform on the surveillance video frames. Next, an adaptive algorithm based on the golden section point is designed to select the transformation order. Then, the phase spectrum information of two adjacent frames is used to characterize the changes in the global and local motion states of the target. Finally, key frames are extracted based on these changes. Experimental results show that, compared with previous methods, the key frames extracted by the proposed method correctly capture the changes in the global and local motion states of the target.
In video information retrieval, key frame extraction has been recognized as one of the important research issues. Although much progress has been made, the existing approaches are either computationally expensive or ineffective in capturing salient visual content. In this paper, we first discuss the importance of key frame extraction and then briefly review and evaluate the existing approaches. To overcome the shortcomings of the existing approaches, we introduce a new algorithm for key frame extraction based on unsupervised clustering. Meanwhile, we provide a feedback chain to adjust the granularity of the extraction result. The proposed algorithm is both computationally simple and able to capture the visual content. Its efficiency and effectiveness are validated on a large number of real-world videos.
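One way to picture clustering-based key frame extraction with adjustable granularity is a single-pass threshold clustering of per-frame feature vectors. The abstract does not specify the clustering rule or the features, so everything here is an assumption made for illustration: the L1 distance, the `max_distance` parameter (standing in for the granularity control), and the choice of the member nearest to the centroid as the key frame.

```python
def l1_distance(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def cluster_key_frames(features, max_distance):
    """Single-pass clustering; returns one key frame index per cluster."""
    clusters = []  # each cluster: {"centroid": [...], "members": [indices]}
    for index, feature in enumerate(features):
        best = min(
            clusters,
            key=lambda c: l1_distance(c["centroid"], feature),
            default=None,
        )
        if best is None or l1_distance(best["centroid"], feature) > max_distance:
            # Too far from every existing cluster: start a new one.
            clusters.append({"centroid": list(feature), "members": [index]})
        else:
            best["members"].append(index)
            n = len(best["members"])
            # Running mean update of the centroid.
            best["centroid"] = [
                c + (f - c) / n for c, f in zip(best["centroid"], feature)
            ]
    keys = [
        min(c["members"], key=lambda i: l1_distance(features[i], c["centroid"]))
        for c in clusters
    ]
    return sorted(keys)


# Assumed toy features, e.g. two-bin normalized color histograms per frame.
frames = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [0.5, 0.5]]
fine = cluster_key_frames(frames, max_distance=0.5)    # three clusters
coarse = cluster_key_frames(frames, max_distance=2.0)  # one cluster
print(len(fine), len(coarse))  # → 3 1
```

Raising or lowering `max_distance` plays the role of a granularity adjustment: a looser threshold merges more frames per cluster and yields fewer key frames.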
A new method for complex activity recognition in videos by key frames was presented. The progressive bisection strategy (PBS) was employed to divide the complex activity into a series of simple activities, and the key frames representing the simple activities were extracted by the self-splitting competitive learning (SSCL) algorithm. A new similarity criterion of complex activities was defined. Besides the regular visual factor, the order factor and the interference factor, which measure the timing matching relationship and the discontinuous matching relationship of the simple activities respectively, were considered. On this basis, complex human activity recognition could be achieved by calculating the similarities of the activities. The recognition error was reduced compared with other methods when ignoring the recognition of simple activities. The proposed method was tested and evaluated on the self-built broadcast gymnastics database and the dancing database. The experimental results demonstrate its superior efficiency.
To overcome the poor local search ability of the genetic algorithm, which makes the basic genetic algorithm time-consuming and weak at searching in the late stage of evolution, we use Gray coding instead of binary coding at the beginning of the coding, and we use multi-point crossover to replace the original single-point crossover operation. The experiment shows that the improved genetic algorithm not only has a strong search capability, but its stability is also effectively improved.
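The two changes named in this abstract, Gray coding and multi-point crossover, can be sketched in a few lines. This is a generic illustration of the standard techniques rather than the authors' implementation; the function names and the example cut points are hypothetical.

```python
def binary_to_gray(n):
    """Reflected binary Gray code of a non-negative integer."""
    return n ^ (n >> 1)


def gray_to_binary(g):
    """Inverse of binary_to_gray."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def multi_point_crossover(parent_a, parent_b, points):
    """Swap alternating segments of two equal-length parents at the cut points."""
    child_a, child_b = [], []
    swap = False
    previous = 0
    for cut in sorted(points) + [len(parent_a)]:
        seg_a, seg_b = parent_a[previous:cut], parent_b[previous:cut]
        if swap:
            seg_a, seg_b = seg_b, seg_a
        child_a.extend(seg_a)
        child_b.extend(seg_b)
        swap = not swap
        previous = cut
    return child_a, child_b


# 7 -> 8 flips four bits in plain binary but only one bit in Gray code.
print(bin(binary_to_gray(7)), bin(binary_to_gray(8)))  # → 0b100 0b1100

child_a, child_b = multi_point_crossover([0] * 8, [1] * 8, points=[2, 5])
print(child_a)  # → [0, 0, 1, 1, 1, 0, 0, 0]
```

With Gray coding, neighboring integer values differ in a single bit, so a one-bit mutation can move a gene to an adjacent value, which is the usual motivation for using it to help local search.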
Pornographic image/video recognition plays a vital role in network information surveillance and management. In this paper, its key techniques, such as skin detection, key frame extraction, and classifier design, are studied in the compressed domain. First, a skin detection method based on data mining in the compressed domain is proposed, which achieves higher detection accuracy as well as higher speed. Then, a cascade scheme of pornographic image recognition based on a selective decision tree ensemble is proposed in order to improve both the speed and accuracy of recognition. Finally, a pornographic-video-oriented key frame extraction solution in the compressed domain and an approach to pornographic video recognition are discussed.
This paper proposes a mobile video surveillance system consisting of intelligent video analysis and mobile communication networking. This multilevel distillation approach helps mobile users monitor massive amounts of surveillance video on demand through video streaming over mobile communication networks. The intelligent video analysis includes moving object detection/tracking and key frame selection, which allow useful video clips to be browsed. The communication networking services, comprising video transcoding, multimedia messaging, and mobile video streaming, transmit surveillance information to mobile appliances. Moving object detection is achieved by background subtraction and particle filter tracking. Key frame selection, which aims to deliver an alarm to a mobile client using the multimedia messaging service accompanied by an extracted clear frame, is achieved by devising a weighted importance criterion considering object clarity and face appearance. In addition, a spatial-domain cascaded transcoder is developed to convert the filtered image sequence of detected objects into the mobile video streaming format. Experimental results show that the system can successfully detect all events of moving objects in a complex surveillance scene, choose very appropriate key frames for users, and transcode the images with a high peak signal-to-noise ratio (PSNR).
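The PSNR figure mentioned at the end of this abstract is commonly computed from the mean squared error between a reference image and its transcoded version, as 10 * log10(MAX^2 / MSE). The sketch below shows that standard definition only; the nested-list image representation and the 8-bit default `max_value=255` are assumptions, and nothing here reflects the paper's transcoder.

```python
import math


def psnr(reference, distorted, max_value=255):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    flat_ref = [p for row in reference for p in row]
    flat_dis = [p for row in distorted for p in row]
    mse = sum((a - b) ** 2 for a, b in zip(flat_ref, flat_dis)) / len(flat_ref)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


# Assumed toy 2x2 grayscale images that differ by one level in two pixels.
original = [[100, 100], [100, 100]]
transcoded = [[101, 99], [100, 100]]
print(round(psnr(original, transcoded), 2))  # MSE is 0.5, so about 51.14 dB
```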
Developments in multimedia technologies have paved the way for the storage of huge collections of video documents on computer systems. It is essential to design tools for content-based access to the documents, so as to allow an efficient exploitation of these collections. Content-based analysis provides a flexible and powerful way to access video data when compared with other traditional video analysis techniques. The area of content-based video indexing and retrieval (CBVIR), focusing on automating the indexing, retrieval, and management of video, has attracted extensive research in the last decade. CBVIR is a lively area of research with enduring recognition from several domains. Herein, a critical assessment of contemporary research associated with the content-based indexing and retrieval of visual information is given. In this paper, we present an extensive review of significant research on CBVIR. A concise description of content-based video analysis, along with the techniques associated with content-based video indexing and retrieval, is presented.
Funding (structured sparse-coding key frame extraction): Supported in part by the National Natural Science Foundation of China (61903090, 61727810, 62073086, 62076077, 61803096, U191140003), the Guangzhou Science and Technology Program Project (202002030289), and the Japan Society for the Promotion of Science (JSPS) KAKENHI (18K18083).
Funding (key frame extraction based on moving object center offset): This work was supported by the National Natural Science Foundation of China (Grant Nos. 61702347, 61772225) and the Natural Science Foundation of Hebei Province (Grant No. F2017210161).
Funding (key frame extraction based on fractional Fourier transform): Supported by the National Natural Science Foundation of China (Nos. 61702347, 62027801, and 61972267), the Natural Science Fund of Hebei Province (No. F2017210161), and the Hebei Province Graduate Student Innovation Ability Training Funding Project.
Funding (complex activity recognition by key frames): Project (50808025) supported by the National Natural Science Foundation of China; Project (20090162110057) supported by the Doctoral Fund of Ministry of Education, China.
Funding (pornographic image/video recognition in compressed domain): Supported by the National Natural Science Foundation of China (No. 60772069) and the 863 High-Tech Project (2008AA01A313).