Abstract: A number of automated video shot boundary detection methods for indexing a video sequence to facilitate browsing and retrieval have been proposed in recent years. With these methods, the dissolve shot boundary is not detected accurately because it involves camera operation and object movement. In this paper, a method based on a support vector machine (SVM) is proposed to detect the dissolve shot boundary in an MPEG compressed sequence. Distinguishing the dissolve shot boundary from other boundaries is treated as a two-class classification problem. Features are extracted directly from the compressed sequences without decoding them, and the optimal class boundary between the two classes is learned from training data using the SVM. Experiments comparing various classification methods show that the proposed method improves the performance of video shot boundary detection.
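The two-class formulation in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it trains a linear soft-margin SVM by subgradient descent on the hinge loss, and the two-dimensional feature vectors are made-up stand-ins for features extracted from the compressed stream. The names `train_linear_svm` and `predict`, and the hyperparameters `lam`, `lr`, and `epochs`, are assumptions chosen for illustration.

```python
def train_linear_svm(samples, labels, lam=0.01, lr=0.1, epochs=50):
    """Fit w, b by subgradient descent on the regularized hinge loss.

    labels must be +1 (here: dissolve boundary) or -1 (any other boundary).
    """
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:
                # Sample is inside the margin: hinge term contributes -y * x.
                w = [wi - lr * (lam * wi - y * xi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                # Only the regularizer contributes to the subgradient.
                w = [wi - lr * lam * wi for wi in w]
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the learned boundary."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Hypothetical toy features; real features would come from the MPEG stream.
features = [(2.0, 2.0), (3.0, 3.0), (2.5, 3.0),
            (-2.0, -2.0), (-3.0, -1.0), (-1.0, -3.0)]
labels = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(features, labels)
predictions = [predict(w, b, x) for x in features]
```

A kernel SVM from an established library would be the more usual choice in practice; the linear version is used here only to keep the sketch dependency-free.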
Abstract: An effective approach to texture mapping for building models, based on digital photogrammetric theory, is proposed. Easily acquired image sequences from a digital video camera mounted on a helicopter are used as the texture source, and the correspondence between each spatial edge in the building geometry model and its line feature in the image sequences is determined semi-automatically. Experimental results from the production of three-dimensional data for car navigation are promising in both efficiency and effectiveness.
Funding: Supported by the Future Network Scientific Research Fund Project of Jiangsu Province (No. FNSRFP2021YB26), the Jiangsu Key R&D Fund on Social Development (No. BE2022789), and the Science Foundation of Nanjing Institute of Technology (No. ZKJ202003).
Abstract: Facial expression recognition (FER) in video has attracted increasing interest, and many approaches have been proposed. The crucial problem in classifying a given video sequence into one of several basic emotions is how to fuse the facial features of individual frames. In this paper, a frame-level attention module is integrated into an improved VGG-based framework, and a lightweight facial expression recognition method is proposed. The proposed network takes a sub-video cut from an experimental video sequence as its input and generates a fixed-dimension representation. The VGG-based network with an enhanced branch embeds face images into feature vectors. The frame-level attention module learns weights that are used to adaptively aggregate the feature vectors into a single discriminative video representation. Finally, a regression module outputs the classification results. Experimental results on the CK+ and AFEW databases show that the recognition rates of the proposed method reach state-of-the-art performance.
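The aggregation step described here, per-frame weights combining frame features into one fixed-dimension vector, can be sketched as follows. This is a generic softmax-attention pooling, not the paper's module: scoring each frame by a dot product with a single vector, and the names `aggregate_frames` and `attention_vector`, are assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_frames(frame_features, attention_vector):
    """Pool per-frame feature vectors into one video-level vector.

    attention_vector is a hypothetical learned parameter; each frame's
    score is its dot product with that vector.
    """
    scores = [sum(f * q for f, q in zip(feat, attention_vector))
              for feat in frame_features]
    weights = softmax(scores)
    dim = len(frame_features[0])
    video_vector = [sum(w * feat[d] for w, feat in zip(weights, frame_features))
                    for d in range(dim)]
    return weights, video_vector


# Two made-up frame embeddings; a zero attention vector weights them equally.
weights, video_vector = aggregate_frames([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
```

Because the weights sum to one, the output has the same dimension as a single frame feature regardless of how many frames the sub-video contains.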
Funding: National Natural Science Foundation of China (No. 61573095); Natural Science Foundation of Shanghai, China (No. 6ZR1446700).
Abstract: The devastating effects of wildland fire remain an unsolved problem, resulting in human losses and the destruction of natural and economic resources. Convolutional neural networks (CNNs) have been shown to perform very well in object classification, and they can perform feature extraction and classification within the same architecture. In this paper, we propose a CNN for identifying fire in videos. A deep-learning-based method for video fire detection is proposed to extract a powerful feature representation of fire. When tested on real video sequences, the proposed approach achieves better classification performance than some relevant conventional video-based fire detection methods and indicates that using a CNN to detect fire in videos is efficient. To balance efficiency and accuracy, the model is fine-tuned considering the nature of the target problem and the fire data. Experimental results on benchmark fire datasets show the effectiveness of the proposed framework and validate its suitability for fire detection in closed-circuit television surveillance systems compared with state-of-the-art methods.
Funding: Supported by the President Fund of Graduate University, Chinese Academy of Sciences.
Abstract: A distortion identification technique based on the Hilbert-Huang transform is presented to identify the distortion model and distortion frequency of distorted real-world image sequences. The distortion model is identified by Hilbert marginal spectral analysis after empirical mode decomposition. The distortion frequency is identified by analyzing the occurrence frequency of the instantaneous frequency components of each intrinsic mode function. A rational digital frequency filter with a suitable cutoff frequency is then designed, based on the identification results, to remove undesired fluctuations. Experimental results show that this technique identifies the distortion model and distortion frequency of a displacement sequence accurately and efficiently. Based on the identification results, a distorted image sequence can be stabilized effectively.
Funding: Supported by the Scientific Item of National Power Company (SPKJ016-071).
Abstract: The main purpose of the model is to show how the Unified Modeling Language (UML) can be used to model a digital video database system (VDBS). It demonstrates the modeling process that can be followed during the analysis phase of complex applications. To guarantee continuity in the mapping between models, the authors propose some suggestions for transforming the use case diagrams into an object diagram, which is one of the main diagrams for the subsequent development phases.
Abstract: Segmentation of semantic Video Object Planes (VOPs) from a video sequence is key to content-based video coding in the MPEG-4 standard. In this paper, an approach for automatic Segmentation of VOPs Based on Spatio-Temporal Information (SBSTI) is proposed. The results demonstrate the good performance of the algorithm.
Abstract: In this paper, we present machine learning algorithms and systems for similar video retrieval. Here, the query is itself a video. For the similarity measurement, exemplars, or representative frames in each video, are extracted by unsupervised learning. For this learning, we chose order-aware competitive learning. After a set of exemplars is obtained for each video, the similarity is computed. Because the numbers and positions of the exemplars differ from video to video, we use a similarity computing method called M-distance, which generalizes existing global and local alignment methods using followers to the exemplars. To represent each frame in the video, this paper emphasizes the Frame Signature of the ISO/IEC standard so that the total system, along with its graphical user interface, becomes practical. Experiments on the detection of inserted plagiaristic scenes showed excellent precision-recall curves, with precision values very close to 1. Thus, the proposed system can work as a plagiarism detector for videos. In addition, this method can be regarded as the structuring of unstructured data via numerical labeling by exemplars. Finally, further sophistication of this labeling is discussed.
Abstract: Objective: To explore the feasibility of using a high-precision accelerometer to measure human respiratory displacement. Methods: A wireless acceleration acquisition system with low power consumption and high precision was designed, with the high-precision acceleration sensor ADXL355 as the core device. Based on the frequency characteristics of breathing motion and the principle that displacement can be calculated by double integration of the acceleration, two displacement measurement algorithms for quasi-periodic weak motion were designed. Results: The simulation results show that the proposed algorithms are effective. The experimental results show that the designed acquisition system and algorithms can calculate human respiratory displacement. Conclusion: A high-precision accelerometer can be used to measure human respiratory displacement, which provides a new method for this measurement.
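The principle stated in this abstract, displacement obtained by integrating acceleration twice, can be sketched as below. This is not either of the paper's two algorithms, which the abstract does not specify. It is a minimal illustration that uses trapezoidal integration and removes the mean after each stage to limit drift, which assumes the motion is quasi-periodic with zero average velocity. The function names, the 100 Hz sampling rate, and the 0.25 Hz synthetic breathing signal are all assumptions.

```python
import math


def cumulative_trapezoid(samples, dt):
    """Running trapezoidal integral, starting from zero."""
    out = [0.0]
    for prev, cur in zip(samples, samples[1:]):
        out.append(out[-1] + 0.5 * (prev + cur) * dt)
    return out


def remove_mean(samples):
    """Subtract the average value, a crude form of drift suppression."""
    mean = sum(samples) / len(samples)
    return [s - mean for s in samples]


def displacement_from_acceleration(accel, dt):
    """Integrate acceleration twice, removing the mean at each stage."""
    velocity = remove_mean(cumulative_trapezoid(remove_mean(accel), dt))
    return remove_mean(cumulative_trapezoid(velocity, dt))


# Synthetic test signal (assumed values): 5 mm amplitude, 0.25 Hz, 100 Hz sampling.
amplitude, freq, dt = 0.005, 0.25, 0.01
omega = 2.0 * math.pi * freq
times = [i * dt for i in range(801)]  # 8 s, i.e. two full periods
accel = [-amplitude * omega ** 2 * math.sin(omega * t) for t in times]
displacement = displacement_from_acceleration(accel, dt)
```

On this noise-free signal the recovered curve stays close to the true displacement `amplitude * sin(omega * t)`. Real sensor data would also carry bias and noise, for which mean removal alone is unlikely to be sufficient.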
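Abstract: When a three-dimensional (3D) human model motion sequence is recovered from video, limited image feature extraction capability leads to poor reconstruction results. To address this problem, a 3D human reconstruction method based on Involution convolution is proposed. First, to introduce a self-attention mechanism, the Involution operator is added to the ResNet50 network structure to obtain the feature vectors of the video frames. Then, a pose estimation network and a shape estimation network are used to obtain the human pose and shape parameters. Finally, the skinned multi-person linear model (SMPL) is used to generate the motion sequence of the 3D human model. In comparative experiments on the 3D pose in the wild (3DPW) dataset against the video inference for body pose and shape estimation (VIBE) method and the temporally consistent mesh recovery (TCMR) method, the average accuracy improved by 3.1% and 0.7% over VIBE and TCMR, respectively. The method can provide more accurate 3D human models for tasks such as motion capture and 3D human animation production.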