Abstract: A two-stage automatic key frame selection method is proposed to improve stitching speed and quality for UAV aerial videos. In the first stage, to reduce redundancy, the overlap rate of the UAV aerial video sequence within the sampling period is calculated, and Lagrange interpolation is used to fit the overlap-rate curve of the sequence. An empirical overlap-rate threshold is then applied to filter candidate key frames from the sequence. In the second stage, the principle of minimizing remapping spots is used to dynamically adjust the selection and determine the final key frame close to each candidate key frame. Comparative experiments show that the proposed method improves stitching speed and accuracy by more than 40%.
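The first stage described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the sampled overlap rates, the threshold value, and the function names `lagrange_interpolate` and `candidate_key_frame` are all assumptions made for the example, and the second (remapping-based) stage is not modeled.

```python
def lagrange_interpolate(xs, ys, x):
    """Evaluate the Lagrange polynomial through the points (xs[i], ys[i]) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def candidate_key_frame(sample_frames, sample_overlaps, threshold):
    """Return the first frame whose fitted overlap rate is at or below threshold.

    sample_frames are the frame indices at which the overlap rate with the
    previous key frame was actually measured (hypothetical sampling scheme).
    """
    for frame in range(sample_frames[0], sample_frames[-1] + 1):
        if lagrange_interpolate(sample_frames, sample_overlaps, frame) <= threshold:
            return frame
    return None


# Hypothetical measurements: overlap with the last key frame at four sampled frames.
frames = [0, 10, 20, 30]
overlaps = [1.0, 0.8, 0.6, 0.4]
candidate = candidate_key_frame(frames, overlaps, threshold=0.65)
```

With these made-up samples the fitted curve is a straight line, so the candidate is simply the first frame at which the interpolated overlap falls to the threshold; real overlap curves would generally be nonlinear.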
Funding: Project (50808025) supported by the National Natural Science Foundation of China; Project (20090162110057) supported by the Doctoral Fund of the Ministry of Education, China
Abstract: A new method for recognizing complex activities in videos using key frames is presented. The progressive bisection strategy (PBS) is employed to divide a complex activity into a series of simple activities, and the key frames representing the simple activities are extracted by the self-splitting competitive learning (SSCL) algorithm. A new similarity criterion for complex activities is defined: besides the regular visual factor, it considers an order factor and an interference factor, which measure the timing matching relationship and the discontinuous matching relationship of the simple activities, respectively. On this basis, complex human activities can be recognized by calculating their similarities. The recognition error is reduced compared with other methods when the recognition of simple activities is ignored. The proposed method was tested and evaluated on a self-built broadcast gymnastics database and a dancing database. The experimental results demonstrate its superior efficiency.
Funding: Supported by the Natural Science Foundation of Hubei Province (2004ABA174)
Abstract: In digital video analysis, browsing, retrieval, and querying, the shot alone cannot meet users' needs. A scene, which is a cluster of a series of shots, partially meets these demands. In this paper, an algorithm for video scene clustering based on shot key frame sets is proposed. We use χ² histogram matching and twin histogram comparison for shot detection. A method is presented for extracting key frame sets based on the distance between non-adjacent frames. Furthermore, the minimum distance between key frame sets is computed as the distance between shots, and scenes are then clustered according to the distance between shots. Experiments with this algorithm show satisfactory performance in correctness and computing speed.
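Two of the quantities named in this abstract can be illustrated briefly. The sketch below assumes one common form of the χ² histogram distance and assumes that the same distance is used between key frames of different shots; the abstract does not confirm either choice, and the function names are hypothetical.

```python
def chi_square_distance(hist_a, hist_b):
    """One common chi-square histogram distance: sum of (a - b)^2 / (a + b)."""
    total = 0.0
    for a, b in zip(hist_a, hist_b):
        if a + b > 0:  # skip bins that are empty in both histograms
            total += (a - b) ** 2 / (a + b)
    return total


def shot_distance(key_set_a, key_set_b):
    """Distance between two shots: the minimum distance over all pairs of
    key frame histograms, one taken from each shot's key frame set."""
    return min(
        chi_square_distance(a, b) for a in key_set_a for b in key_set_b
    )


# Hypothetical two-bin histograms for the key frames of two shots.
shot_1 = [[4, 0]]
shot_2 = [[3, 1], [0, 4]]
distance = shot_distance(shot_1, shot_2)
```

Shots whose distance falls below some chosen clustering threshold would then be grouped into the same scene; how that threshold is set is not stated in the abstract.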
Funding: Supported in part by the National Natural Science Foundation of China (61903090, 61727810, 62073086, 62076077, 61803096, U191140003), the Guangzhou Science and Technology Program Project (202002030289), and Japan Society for the Promotion of Science (JSPS) KAKENHI (18K18083).
Abstract: Key frame extraction based on sparse coding can reduce the redundancy of continuous frames and concisely express the entire video. However, developing a key frame extraction algorithm that can automatically extract a few frames with a low reconstruction error remains a challenge. In this paper, we propose a novel model of structured sparse-coding-based key frame extraction, wherein a nonconvex group log-regularizer is used to obtain strong sparsity and a low reconstruction error. To automatically extract key frames, a decomposition scheme is designed to separate the sparse coefficient matrix by rows. The rows constrained by the nonconvex group log-regularizer become zero or nonzero, leading to the learning of a structured sparse coefficient matrix. To solve the nonconvex problems caused by the log-regularizer, the difference of convex algorithm (DCA) is employed to decompose the log-regularizer into the difference of two convex functions related to the ℓ1 norm, whose solution can be obtained directly through the proximal operator. Therefore, an efficient structured sparse coding algorithm with the group log-regularizer for key frame extraction is developed, which can automatically extract a few frames directly from the video to represent the entire video with a low reconstruction error. Experimental results demonstrate that the proposed algorithm extracts more accurate key frames from most SumMe videos than state-of-the-art methods. Furthermore, the proposed algorithm obtains higher compression, with a nearly 18% increase compared with sparse modeling representation selection (SMRS) and an 8% increase compared with SC-det on the VSUMM dataset.
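To make the row-sparsity idea concrete, the sketch below shows a standard building block, the proximal operator of a row-wise group penalty (group soft-thresholding), which drives whole rows of a coefficient matrix to zero. It is not the paper's log-regularizer or its DCA update, whose exact forms are not given in the abstract; the matrix values, `lam`, and the function names are assumptions.

```python
import math


def group_soft_threshold(matrix, lam):
    """Shrink each row toward zero; rows with l2 norm <= lam become all zeros."""
    result = []
    for row in matrix:
        norm = math.sqrt(sum(v * v for v in row))
        scale = max(0.0, 1.0 - lam / norm) if norm > 0 else 0.0
        result.append([scale * v for v in row])
    return result


def selected_rows(matrix):
    """Indices of rows that remain nonzero, read here as selected key frames."""
    return [i for i, row in enumerate(matrix) if any(v != 0.0 for v in row)]


# Hypothetical coefficient matrix: one row per frame of the video.
coeffs = [[3.0, 4.0], [0.3, 0.4], [0.0, 0.0]]
shrunk = group_soft_threshold(coeffs, lam=1.0)
key_frame_rows = selected_rows(shrunk)
```

In this toy case only the first row survives the shrinkage, so only the first frame would be kept as a key frame.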
Funding: Supported by the National Natural Science Foundation of China (Nos. 61702347, 62027801 and 61972267), the Natural Science Fund of Hebei Province (No. F2017210161), and the Hebei Province Graduate Student Innovation Ability Training Funding Project.
Abstract: With the vigorous development of national infrastructure construction and public information construction, video surveillance systems have gradually penetrated various fields. Current key frame extraction technology captures target details inadequately and judges local actions inaccurately. To address this problem, a key frame extraction method based on the fractional Fourier transform is proposed. The method first obtains phase spectrum information of different orders by performing the fractional Fourier transform on the surveillance video frames. Next, an adaptive algorithm based on the golden section point selects the transformation order. Then, the phase spectrum information of two adjacent frames is used to characterize the changes in the global and local motion states of the target. Finally, key frames are extracted on this basis. Experimental results show that, compared with previous methods, the key frames extracted by the proposed method correctly capture the changes in the global and local motion states of the target.
Funding: This work was supported by the National Natural Science Foundation of China (Grant Nos. 61702347, 61772225) and the Natural Science Foundation of Hebei Province (Grant No. F2017210161).
Abstract: With the explosive growth of surveillance video data, browsing videos quickly and effectively has become an urgent problem. Video key frame extraction has received widespread attention as an effective solution. However, accurately capturing the local motion state changes of moving objects in the video is still challenging in key frame extraction. The center offset of a target can reflect the change in its motion state. Based on this observation, this paper proposes a novel key frame extraction method based on the center offset of moving objects. The proposed method uses the center offset to obtain the global and local motion state information of moving objects and selects the video frames where the center offset curve changes suddenly as key frames. Such processing effectively overcomes the inaccuracy of traditional key frame extraction methods. First, the center point in each frame is extracted. Next, the center point offset of each frame is calculated, and the center offset curve is formed by connecting the center offsets of the frames. Finally, candidate key frames are extracted and optimized to generate the final key frames. The experimental results demonstrate that the proposed method outperforms the compared methods in capturing the local motion state changes of moving objects.
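The center offset curve and the "sudden change" test can be sketched as below. This is only an illustration under stated assumptions: object centers are taken as already detected, a sudden change is defined as a jump between consecutive offsets larger than a fixed `jump` value, and the optimization of candidate key frames is omitted. The function names and data are hypothetical.

```python
import math


def center_offsets(centers):
    """Offset of the object center between each pair of consecutive frames.

    offsets[k] is the distance moved between frame k and frame k + 1.
    """
    return [math.dist(centers[i - 1], centers[i]) for i in range(1, len(centers))]


def sudden_change_frames(offsets, jump):
    """Frames at which the offset curve changes by more than jump.

    The returned index is the frame at which the new motion state begins.
    """
    return [
        k + 1
        for k in range(1, len(offsets))
        if abs(offsets[k] - offsets[k - 1]) > jump
    ]


# Hypothetical object centers: slow horizontal motion, then fast vertical motion.
centers = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 5), (3, 10)]
offsets = center_offsets(centers)
candidates = sudden_change_frames(offsets, jump=2.0)
```

Here the object speeds up between frames 3 and 4, so frame 4 is flagged as a candidate key frame.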
Abstract: The genetic algorithm has poor local search ability, which makes the basic genetic algorithm time-consuming and weak at searching in the late stage of evolution. To overcome this, we use Gray coding instead of binary coding at the beginning of the coding, and we use multi-point crossover to replace the original single-point crossover operation. Finally, the experiment shows that the improved genetic algorithm not only has a strong search capability but also has effectively improved stability.
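The two modifications named in this abstract are standard operations and can be sketched as follows. The chromosome layout, the cut points, and the function names are assumptions chosen for illustration; the abstract does not say how many crossover points the authors use.

```python
def binary_to_gray(n):
    """Convert a non-negative integer to its reflected Gray code."""
    return n ^ (n >> 1)


def gray_to_binary(g):
    """Invert binary_to_gray by folding in successively shifted copies."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def multi_point_crossover(parent_a, parent_b, points):
    """Exchange every second segment between the given cut points."""
    child_a, child_b = list(parent_a), list(parent_b)
    swap = False
    prev = 0
    for cut in sorted(points) + [len(parent_a)]:
        if swap:
            child_a[prev:cut], child_b[prev:cut] = child_b[prev:cut], child_a[prev:cut]
        swap = not swap
        prev = cut
    return child_a, child_b


# Two-point crossover on hypothetical six-gene chromosomes.
child_a, child_b = multi_point_crossover([0] * 6, [1] * 6, points=[2, 4])
```

Gray coding is attractive here because adjacent integers differ in exactly one bit, so a single-bit mutation can move to a neighboring value, which plain binary coding does not guarantee.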
Funding: Supported by the National Natural Science Foundation of China (No. 60072009)
Abstract: This paper proposes a novel algorithm for extracting key frames to represent video shots. Different interpretations have been suggested regarding whether, or how well, a key frame represents a shot. We develop our algorithm on the assumption that more important content may demand more attention and may last for relatively more frames. Unsupervised clustering is used to divide the frames within a shot into clusters, and a key frame is then selected from each candidate cluster. To make the algorithm independent of particular video sequences, we employ a statistical model to calculate the clustering threshold. The proposed algorithm can capture important and salient content as the key frame. Its robustness and adaptability are validated by experiments with various kinds of video sequences.
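A clustering-based selection of this kind might look like the sketch below. It is a generic threshold-based sequential clustering, not the authors' algorithm: the per-frame feature vectors, the fixed `threshold` (the paper derives it from a statistical model that the abstract does not describe), the rule of picking the frame nearest the cluster centroid, and all function names are assumptions.

```python
import math


def centroid_of(features, cluster):
    """Mean feature vector of the frames whose indices are in cluster."""
    dims = len(features[0])
    return [sum(features[i][d] for i in cluster) / len(cluster) for d in range(dims)]


def cluster_frames(features, threshold):
    """Put each frame in the nearest cluster, or open a new cluster when
    the nearest centroid is farther away than threshold."""
    clusters = []
    for idx, feat in enumerate(features):
        best, best_dist = None, None
        for cluster in clusters:
            d = math.dist(feat, centroid_of(features, cluster))
            if best_dist is None or d < best_dist:
                best, best_dist = cluster, d
        if best is not None and best_dist <= threshold:
            best.append(idx)
        else:
            clusters.append([idx])
    return clusters


def key_frames(features, clusters):
    """From each cluster, pick the frame closest to the cluster centroid."""
    return [
        min(c, key=lambda i: math.dist(features[i], centroid_of(features, c)))
        for c in clusters
    ]


# Hypothetical one-dimensional features for six frames of a shot.
features = [[0.0], [0.1], [0.2], [5.0], [5.1], [5.3]]
clusters = cluster_frames(features, threshold=1.0)
selected = key_frames(features, clusters)
```

A filter that keeps only sufficiently large clusters as candidates, in line with the assumption that important content lasts for more frames, could be added before the final selection.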
Abstract: In video information retrieval, key frame extraction has been recognized as one of the important research issues. Although much progress has been made, the existing approaches are either computationally expensive or ineffective in capturing salient visual content. In this paper, we first discuss the importance of key frame extraction and then briefly review and evaluate the existing approaches. To overcome their shortcomings, we introduce a new algorithm for key frame extraction based on unsupervised clustering. We also provide a feedback chain to adjust the granularity of the extraction result. The proposed algorithm is both computationally simple and able to capture the visual content. Its efficiency and effectiveness are validated on a large number of real-world videos.
Funding: Supported by the National Natural Science Foundation of China (No. 60772069) and the 863 High-Tech Project (2008AA01A313)
Abstract: Pornographic image/video recognition plays a vital role in network information surveillance and management. In this paper, its key techniques, such as skin detection, key frame extraction, and classifier design, are studied in the compressed domain. First, a skin detection method based on data mining in the compressed domain is proposed, which achieves higher detection accuracy as well as higher speed. Then, a cascade scheme for pornographic image recognition based on a selective decision tree ensemble is proposed to improve both the speed and the accuracy of recognition. Finally, a key frame extraction solution for pornographic video in the compressed domain and an approach to pornographic video recognition are discussed.
Abstract: This paper proposes a mobile video surveillance system consisting of intelligent video analysis and mobile communication networking. This multilevel distillation approach helps mobile users monitor large volumes of surveillance video on demand through video streaming over mobile communication networks. The intelligent video analysis includes moving object detection/tracking and key frame selection, which make it possible to browse useful video clips. The communication networking services, comprising video transcoding, multimedia messaging, and mobile video streaming, transmit surveillance information to mobile appliances. Moving object detection is achieved by background subtraction and particle filter tracking. Key frame selection, which aims to deliver an alarm to a mobile client through the multimedia messaging service together with an extracted clear frame, is achieved by devising a weighted importance criterion that considers object clarity and face appearance. In addition, a spatial-domain cascaded transcoder is developed to convert the filtered image sequence of detected objects into the mobile video streaming format. Experimental results show that the system can successfully detect all events of moving objects in a complex surveillance scene, choose appropriate key frames for users, and transcode the images with a high peak signal-to-noise ratio (PSNR).
Funding: Project supported by the National Natural Science Foundation of China (No. 60903134), the National Key Technology R&D Program of China (No. 2007BAH11B00), and the Fundamental Research Funds for the Central Universities of China (No. 2009QNA5021)
Abstract: Traditional cartoons have been widely used in entertainment, education, and advertisement. Thus, a large amount of cartoon data is available. In this paper, we propose a new technique for capturing the motion of a character in an existing cartoon sequence. This technique tracks the contours of the cartoon character in the sequence, and key frames are used to guide the tracking. We model contour tracking as a space-time optimization problem in which an energy function including both temporal and spatial constraints is defined. First, the user labels the contours of the character on the key frames. Then, the contours on the intermediate frames are tracked by minimizing the energy function. The user may need to interactively adjust the tracking result and restart the optimization process to refine the result. Finally, an edge snapping algorithm is applied to make the tracking result more precise. Experiments show that our technique works effectively.
Abstract: Nowadays, it is quite easy to read or write kanji, that is, Chinese characters, by writing them on a computer monitor and/or in a word processor on screen. On the other hand, their real meanings and the history of how they originated and developed are not always understood by Japanese people. Moreover, kanji today are often used only as symbols without other meaning, and words that carry no meaning beyond their pronunciation are prevalent, a trend that should not be supported by the general public. In this study, we propose a system of educational tools for presenting kanji characters so that students can understand their meanings as they originated from their prototypes, the hieroglyphic images representing their original meanings. The key frame or interval figure method is one of the effective methods in computer graphics (CG) for showing the transition of an original figure (A) to its current form (B). This method is considered very effective when kanji are written or displayed in straight lines and curves, as they usually are. However, kanji are sometimes also written or drawn like a picture with a special pen or a Japanese fude; for such cases, an auxiliary technique called morphing is introduced in this study. Several examples are demonstrated on video tape at the conference.
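The key frame (interval figure) method mentioned above can be illustrated with simple linear in-betweening. This sketch assumes that figures A and B are given as lists of corresponding 2D points, which the abstract does not specify; the function names and the sample strokes are hypothetical, and morphing of brush-drawn shapes is not covered.

```python
def inbetween(shape_a, shape_b, t):
    """Blend corresponding points linearly: t = 0 gives A, t = 1 gives B."""
    if len(shape_a) != len(shape_b):
        raise ValueError("shapes must have the same number of points")
    return [
        ((1 - t) * ax + t * bx, (1 - t) * ay + t * by)
        for (ax, ay), (bx, by) in zip(shape_a, shape_b)
    ]


def interval_figures(shape_a, shape_b, steps):
    """Key frames A and B plus the steps - 1 intermediate figures between them."""
    return [inbetween(shape_a, shape_b, k / steps) for k in range(steps + 1)]


# Hypothetical stroke: a horizontal line in the prototype (A) that sits
# higher in the current character form (B).
figure_a = [(0, 0), (2, 0)]
figure_b = [(0, 4), (2, 4)]
transition = interval_figures(figure_a, figure_b, steps=4)
```

Displaying the figures in `transition` in order shows the stroke moving gradually from its original position to its current one.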
Abstract: Developments in multimedia technologies have paved the way for the storage of huge collections of video documents on computer systems. It is essential to design tools for content-based access to the documents so as to allow efficient exploitation of these collections. Content-based analysis provides a flexible and powerful way to access video data compared with other traditional video analysis techniques. The area of content-based video indexing and retrieval (CBVIR), which focuses on automating the indexing, retrieval, and management of video, has attracted extensive research in the last decade. CBVIR is a lively area of research with enduring recognition from several domains. This paper offers a critical assessment of contemporary research on the content-based indexing and retrieval of visual information and presents an extensive review of significant research on CBVIR. A concise description of content-based video analysis is presented, along with the techniques associated with content-based video indexing and retrieval.