The motivation for this study is that the quality of deep fakes is constantly improving, which creates a need for new detection methods. The proposed Customized Convolutional Neural Network (CNN) method extracts structured data from video frames using facial landmark detection, and this data is then used as input to the CNN. The customized CNN is a data-augmentation-based model used to generate ‘fake data’ or ‘fake images’. This study was carried out using Python and its libraries. We used 242 videos from the dataset gathered by the Deep Fake Detection Challenge, of which 199 were fake and the remaining 53 were real. Ten seconds were allotted for each video. There were 318 videos used in all, 199 of which were fake and 119 of which were real. Our proposed method achieved a testing accuracy of 91.47%, a loss of 0.342, and an AUC score of 0.92, outperforming two alternative approaches, CNN and MLP-CNN. Furthermore, our method achieved higher accuracy than contemporary models such as XceptionNet, Meso-4, EfficientNet-B0, MesoInception-4, VGG-16, and DST-Net. The novelty of this investigation is the development of a new CNN learning model that can accurately detect deep fake face photos.
Regular exercise is a crucial aspect of daily life, as it enables individuals to stay physically active, lowers the likelihood of developing illnesses, and enhances life expectancy. The recognition of workout actions in video streams holds significant importance in computer vision research, as it aims to enhance exercise adherence, enable instant recognition, advance fitness tracking technologies, and optimize fitness routines. However, existing action datasets often lack diversity and specificity for workout actions, hindering the development of accurate recognition models. To address this gap, the Workout Action Video dataset (WAVd) has been introduced as a significant contribution. WAVd comprises a diverse collection of labeled workout action videos, curated to encompass various exercises performed by numerous individuals in different settings. This research proposes a framework based on the Attention-driven Residual Deep Convolutional-Gated Recurrent Unit (ResDC-GRU) network for workout action recognition in video streams. Unlike image-based action recognition, videos contain spatio-temporal information, making the task more complex and challenging. While substantial progress has been made in this area, challenges persist in detecting subtle and complex actions, handling occlusions, and managing the computational demands of deep learning approaches. The proposed ResDC-GRU Attention model achieved 95.81% accuracy in classifying workout action videos and outperformed various state-of-the-art models. The method also yielded 81.6%, 97.2%, 95.6%, and 93.2% accuracy on the established benchmark datasets HMDB51, Youtube Actions, UCF50, and UCF101, respectively, demonstrating its robustness in action recognition. The findings suggest practical implications in real-world scenarios where precise video action recognition is paramount, addressing the persisting challenges in the field. The WAVd dataset serves as a catalyst for the development of more robust and effective fitness tracking systems and ultimately promotes healthier lifestyles through improved exercise monitoring and analysis.
Video summarization aims to select key frames or key shots to create summaries for fast retrieval, compression, and efficient browsing of videos. Graph neural networks efficiently capture information about graph nodes and their neighbors, but ignore the dynamic dependencies between nodes. To address this challenge, we propose an Adaptive Graph Convolutional Adjacency Matrix Network (TAMGCN), which leverages the attention mechanism to dynamically adjust dependencies between graph nodes. Specifically, we first segment shots and extract features of each frame, then compute the representative features of each shot. Subsequently, we use the attention mechanism to dynamically adjust the adjacency matrix of the graph convolutional network to better capture the dynamic dependencies between graph nodes. Finally, we fuse temporal features extracted by a Bi-directional Long Short-Term Memory network with structural features extracted by the graph convolutional network to generate high-quality summaries. Extensive experiments are conducted on two benchmark datasets, TVSum and SumMe, yielding F1-scores of 60.8% and 53.2%, respectively. Experimental results demonstrate that our method outperforms most state-of-the-art video summarization techniques.
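The idea of letting attention adjust the adjacency matrix can be illustrated with a small sketch. This is not the TAMGCN implementation: it assumes scaled dot-product similarity between shot features, a row-wise softmax, and a single weight-free propagation step, and the names `attention_adjacency` and `propagate` are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_adjacency(shot_features):
    """Build a row-normalized adjacency matrix from scaled dot-product
    similarity between shot features (an assumed attention form)."""
    dim = len(shot_features[0])
    scores = [
        [sum(a * b for a, b in zip(fi, fj)) / math.sqrt(dim) for fj in shot_features]
        for fi in shot_features
    ]
    return [softmax(row) for row in scores]


def propagate(adjacency, shot_features):
    """One graph-convolution step without learned weights: H' = A H."""
    n, dim = len(shot_features), len(shot_features[0])
    return [
        [sum(adjacency[i][j] * shot_features[j][d] for j in range(n)) for d in range(dim)]
        for i in range(n)
    ]


# Three toy shot features: shots 0 and 1 are similar, shot 2 differs.
shots = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
A = attention_adjacency(shots)
H = propagate(A, shots)
```

In this toy input, shot 0 attends more strongly to the similar shot 1 than to shot 2, so the edge weights depend on the content rather than on a fixed graph.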
Recent research advances in implicit neural representation have shown that a wide range of video data distributions are achieved by sharing model weights for Neural Representation for Videos (NeRV). While explicit methods exist for accurately embedding ownership or copyright information in video data, the nascent NeRV framework has yet to address this issue comprehensively. In response, this paper introduces MarkINeRV, a scheme that embeds watermarking information into video frames using an invertible neural network watermarking approach to protect the copyright of NeRV. The scheme models the embedding and extraction of watermarks as a pair of inverse processes of a reversible network and employs the same network for both, with the information flowing in opposite directions. Additionally, a video frame quality enhancement module is incorporated to mitigate watermarking information losses in the rendering process and the possibility of malicious attacks during transmission, ensuring the accurate extraction of watermarking information through the invertible network’s inverse process. This paper evaluates the accuracy, robustness, and invisibility of MarkINeRV on multiple video datasets. The results demonstrate its efficacy in extracting watermarking information for copyright protection of NeRV. MarkINeRV represents a pioneering investigation into copyright issues surrounding NeRV.
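To make the "same network, opposite direction" idea concrete, here is a minimal sketch of an additive coupling block, a common building block of invertible networks. It is not the MarkINeRV architecture: the functions `f` and `g` are hypothetical stand-ins for learned sub-networks, and the data are toy lists of floats.

```python
def f(values):
    """Hypothetical stand-in for a learned sub-network."""
    return [0.1 * v for v in values]


def g(values):
    """Second hypothetical sub-network; it need not be invertible itself."""
    return [0.05 * v * v for v in values]


def forward(cover, mark):
    """Embedding direction: mix the watermark into the cover signal."""
    stego = [c + d for c, d in zip(cover, f(mark))]
    residual = [m + d for m, d in zip(mark, g(stego))]
    return stego, residual


def inverse(stego, residual):
    """Extraction direction: undo the two additive steps in reverse order."""
    mark = [r - d for r, d in zip(residual, g(stego))]
    cover = [s - d for s, d in zip(stego, f(mark))]
    return cover, mark


cover_in = [0.2, 0.5, 0.9]
mark_in = [1.0, 0.0, 1.0]
stego_out, residual_out = forward(cover_in, mark_in)
cover_back, mark_back = inverse(stego_out, residual_out)
```

Because each step only adds a quantity that can be recomputed from the other branch, the inverse recovers the inputs up to floating-point rounding without requiring `f` or `g` to be invertible.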
Resource allocation is an important problem in ubiquitous networks. Most existing resource allocation methods consider only wireless networks, so they are not suitable for the ubiquitous network environment and harm the interests of individual users with unstable resource requirements. This paper considers multi-point video surveillance scenarios in a complex network environment with both wired and wireless networks. We introduce a utility estimated from the total costs of an individual network user. The problem is studied through mathematical modeling, and we propose an improved problem-specific branch-and-cut algorithm to solve it. The algorithm follows the divide-and-conquer principle and fully considers the duality feature of network selection. The experiment is conducted by simulation using C and Lingo. It shows that, compared with a centralized random allocation scheme and a cost-greedy allocation scheme, the proposed scheme reduces the total costs for the user by 13.0% and 30.6%, respectively.
Action recognition is an important topic in computer vision. Recently, deep learning technologies have been successfully used in many applications, including solving recognition problems on video data. However, most existing deep-learning-based recognition frameworks are not optimized for actions in surveillance videos. In this paper, we propose a novel method for recognizing different types of actions in outdoor surveillance videos. The proposed method first introduces motion compensation to improve the detection of human targets. Then, it uses three different types of deep models, with single and sequenced images as inputs, to recognize different types of actions. Finally, predictions from the different models are fused with a linear model. Experimental results show that the proposed method works well on real surveillance videos.
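The final fusion step can be sketched as a weighted sum of per-model class probabilities. The abstract does not give the weights or the number of classes, so the values below are made up for illustration and the function name `fuse` is hypothetical.

```python
def fuse(predictions, weights):
    """Linearly combine per-model class probabilities.

    predictions: one probability list per model, all of the same length.
    weights: one scalar per model (assumed here to sum to 1).
    """
    n_classes = len(predictions[0])
    return [
        sum(w * p[c] for w, p in zip(weights, predictions))
        for c in range(n_classes)
    ]


# Three hypothetical models scoring two action classes.
model_outputs = [[0.7, 0.3], [0.4, 0.6], [0.2, 0.8]]
model_weights = [0.5, 0.3, 0.2]
fused = fuse(model_outputs, model_weights)
predicted_class = max(range(len(fused)), key=fused.__getitem__)
```

In practice the weights would be fitted on held-out data rather than chosen by hand.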
For intelligent surveillance videos, anomaly detection is extremely important. Deep learning algorithms have been popular for evaluating real-time surveillance recordings of events such as traffic accidents and criminal or unlawful incidents, as well as suicide attempts. Nevertheless, deep learning methods for classification, like convolutional neural networks, require a lot of computing power. Quantum computing is a branch of technology that solves abnormal and complex problems using quantum mechanics. As a result, the focus of this research is on developing a hybrid quantum computing model based on deep learning. This research develops a Quantum Computing-based Convolutional Neural Network (QC-CNN) to extract features and classify anomalies from surveillance footage. A quantum circuit, such as the real amplitude circuit, is used to improve the performance of the model. To the best of our knowledge, this is the first work to employ quantum deep learning techniques to classify anomalous events in video surveillance applications. Thirteen anomalies from the UCF-crime dataset are classified. Based on experimental results, the proposed model classifies data efficiently as measured by the confusion matrix, Receiver Operating Characteristic (ROC), accuracy, Area Under Curve (AUC), precision, recall, and F1-score. The proposed QC-CNN attained a best accuracy of 95.65 percent, which is 5.37% greater than that of other existing models. To measure the efficiency of the proposed work, QC-CNN is also evaluated against classical and quantum models.
To transfer color data from a device-dependent (video camera) color space into a device-independent color space, a multilayer feedforward network with the error backpropagation (BP) learning rule was used as a nonlinear transformer realizing the mapping from the RGB color space to the CIELAB color space. Different network structures yielded different mapping accuracies. BP neural networks can provide satisfactory mapping accuracy for color space transformation for video cameras.
A novel bandwidth prediction and control scheme is proposed for video transmission over an ad hoc network. The scheme is based on cross-layer, feedback, and Bayesian network techniques. The factors that impact video quality are formulated and derived. The relevant factors are obtained by a cross-layer mechanism or a feedback method. According to these relevant factors, the variable set and the Bayesian network topology are determined. Then a Bayesian network prediction model is constructed. The results of the prediction can be used as the bandwidth of the mobile ad hoc network (MANET). According to the bandwidth, the video encoder is controlled to dynamically adjust and encode the right bit rates of a real-time video stream. An integrated simulation of a video streaming communication system is implemented to validate the proposed solution. In contrast to the conventional transfer scheme, the experimental results indicate that the proposed scheme can make the best use of the network bandwidth, with considerable improvements in the packet loss and the visual quality of real-time video.
We are interested in providing Video-on-Demand (VoD) streaming service to a large population of clients using a peer-to-peer (P2P) approach. Given the asynchronous demands from multiple clients, the continuously changing buffered contents, and the continuous video display requirement, how to collaborate with potential partners to obtain the expected data for future content delivery is an important and challenging problem. In this paper, we develop a novel scheduling algorithm based on deadline-aware network coding (DNC) to fully exploit the network resources for efficient VoD service. DNC generalizes the existing network coding (NC) paradigm, an elegant solution for ubiquitous data distribution. With deadline awareness, DNC improves the network throughput while avoiding, with high probability, missed play deadlines, which are a major deficiency of conventional NC. Extensive simulation results demonstrate that DNC achieves high streaming continuity even in tight network conditions.
In this paper, we propose a multi-source multi-path video streaming system for supporting high-quality concurrent video-on-demand (VoD) services over wireless mesh networks (WMNs), and leverage forward error correction to enhance the error resilience of the system. By taking wireless interference into consideration, we present a more realistic networking model to capture the characteristics of WMNs and then design a route selection scheme using a joint rate/interference-distortion optimization framework to help the system optimally select concurrent streaming paths. We mathematically formulate this route selection problem and solve it heuristically using a genetic algorithm. Simulation results demonstrate the effectiveness of our proposed scheme.
The devastating effects of wildland fire are an unsolved problem, resulting in human losses and the destruction of natural and economic resources. Convolutional neural networks (CNNs) have been shown to perform very well in object classification. This type of network can perform feature extraction and classification within the same architecture. In this paper, we propose a CNN for identifying fire in videos. A deep domain-based method for video fire detection is proposed to extract a powerful feature representation of fire. Tested on real video sequences, the proposed approach achieves better classification performance than some relevant conventional video-based fire detection methods and indicates that using a CNN to detect fire in videos is efficient. To balance efficiency and accuracy, the model is fine-tuned considering the nature of the target problem and the fire data. Experimental results on benchmark fire datasets reveal the effectiveness of the proposed framework and validate its suitability for fire detection in closed-circuit television surveillance systems compared to state-of-the-art methods.
Wireless Sensor Networks (WSNs) have emerged in the last decade as a powerful tool for connecting the physical and digital worlds. WSNs have been used in many applications, such as habitat monitoring, building monitoring, smart grids, and pipeline monitoring. In addition, a few researchers have been experimenting with WSNs in mission-critical applications such as military applications. This paper surveys the literature on experimental work in border surveillance and intrusion detection using WSN technology. The potential benefits of using WSNs in border surveillance are huge; however, to the best of our knowledge, very few attempts to solve the many critical issues of this application can be found in the literature.
In recent years, the number of gun-related incidents has exceeded 250,000 per year, and over 85% of the existing 1 billion firearms are in civilian hands. Manual monitoring has not proven effective in detecting firearms, which is why an automated weapon detection system is needed. Various automated convolutional neural network (CNN) weapon detection systems have been proposed in the past and have generated good results. However, these techniques have high computation overhead and are too slow to provide the real-time detection that is essential for a weapon detection system. These models also have a high rate of false negatives because they often fail to detect guns due to the low quality and visibility issues of surveillance videos. This research work aims to minimize the rate of false negatives and false positives in weapon detection while keeping the speed of detection as a key parameter. The proposed framework is based on You Only Look Once (YOLO) and Area of Interest (AOI). Initially, the models take pre-processed frames in which the background is removed using the Gaussian blur algorithm. The proposed architecture is assessed through various performance parameters such as false negatives, false positives, precision, recall rate, and F1 score. The results show that, thanks to YOLO-v5s, a high recall rate and detection speed are achieved. Speed reached 0.010 s per frame, compared to 0.17 s for Faster R-CNN. The approach is promising for use in the field of security and weapon detection.
Objective: To explore and visualize the connectivity of suspected Ebola cases and surveillance callers who used cellphone technology in Moyamba District in Sierra Leone for Ebola surveillance, and to examine the demographic differences and characteristics of Ebola surveillance callers who make more calls as well as those callers who are more likely to make at least one positive Ebola call. Methods: Surveillance data for 393 suspected Ebola cases (192 males, 201 females) were collected from October 23, 2014 to June 28, 2015 using cellphone technology. UCINET and NetDraw software were used to explore and visualize the social connectivity between callers and suspected Ebola cases. Poisson and logistic regression were used for multivariable analysis. Results: The entire social network comprised 393 ties and 745 nodes. Women (AOR = 0.33, 95% CI [0.14, 0.81]) were associated with decreased odds of making at least one positive Ebola surveillance call compared to men. Women (IR = 0.63, 95% CI [0.49, 0.82]) were also associated with making fewer Ebola surveillance calls compared to men. Conclusion: Social network visualization can analyze syndromic surveillance data for Ebola collected by cellphone technology with unique insights.
The increasing popularity of smart mobile devices and the rise of online services have increased the requirements for efficient dissemination of social video content. In this paper, we study the problem of distributing video from a cloud server to users in a partially connected cooperative D2D network using network coding. In such a scenario, transmission conflicts arise from simultaneous transmissions of multiple devices, and the scheduling decision must be made not only on the encoded packets but also on the set of transmitting devices. We analyze the lower bound and give an integer linear formulation of the joint optimization problem over the set of transmitting devices and the packet combinations. We also propose a heuristic solution for this setup using a conflict graph and a local graph at every device. Simulation results show that our coding scheme significantly reduces the number of transmission slots, which increases the efficiency of video delivery.
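Why coded packets can save transmission slots is easiest to see in the smallest case. The sketch below shows plain XOR network coding between two users who each hold the packet the other is missing. It illustrates only the coding idea, not the paper's conflict-graph scheduling heuristic, and the packet contents are made up.

```python
def xor_bytes(a, b):
    """XOR two equal-length packets byte by byte."""
    if len(a) != len(b):
        raise ValueError("packets must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


# User 1 already holds packet_1 and misses packet_2; user 2 is the reverse.
packet_1 = b"frameA"
packet_2 = b"frameB"

# One coded transmission serves both users instead of two separate ones.
coded = xor_bytes(packet_1, packet_2)

recovered_by_user_1 = xor_bytes(coded, packet_1)  # yields packet_2
recovered_by_user_2 = xor_bytes(coded, packet_2)  # yields packet_1
```

Each user cancels the packet it already has, so a single slot replaces two uncoded transmissions.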
Most previous video recording devices in mobile vehicles store captured video content locally. With the rapid development of 4G/Wi-Fi networks, there is a new trend to equip video recording devices with wireless interfaces so that video can be uploaded to the cloud for later playback. In this paper, we propose a QoE-aware mobile cloud video recording scheme for roadside vehicular networks, which adaptively selects the proper wireless interface and video bitrate for uploading video to the cloud. To maximize the total utility, we need to design a control strategy that carefully balances the transmission cost and the QoE achieved for users. To this end, we investigate the tradeoff between the cost incurred by uploading through cellular networks and the achieved QoE of users. We apply the optimization framework to solve the formulated problem and design an online scheduling algorithm. We also conduct extensive trace-driven simulations, and our results show that our algorithm achieves a good balance between the transmission cost and user QoE.
In video surveillance, moving object tracking is affected by many interference factors such as target changes, complex scenes, and target deformation. To resolve this issue, and based on a comparative analysis of several common moving object detection methods, this paper presents a moving object detection and recognition algorithm that combines frame difference with background subtraction. In the algorithm, we first compute the average gray value over consecutive frames of the dynamic image sequence to obtain the background image; that is, N consecutively captured frames are summed and averaged. In this way, the weight of the object information is increased and the static background is suppressed. The final motion detection image contains both the target contour and additional target information around the contour points obtained from the background image, so that the moving target is separated from the image. The simulation results show the effectiveness of the proposed algorithm.
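The combination described above can be sketched on tiny grayscale frames stored as nested lists. The abstract does not say how the two masks are merged or which threshold is used, so the logical OR and the threshold of 30 below are assumptions, and the function names are hypothetical.

```python
def average_background(frames):
    """Estimate the background as the pixel-wise mean of N frames."""
    n = len(frames)
    rows, cols = len(frames[0]), len(frames[0][0])
    return [
        [sum(frame[r][c] for frame in frames) / n for c in range(cols)]
        for r in range(rows)
    ]


def diff_mask(image_a, image_b, threshold):
    """Mark pixels whose absolute gray-level difference exceeds the threshold."""
    return [
        [abs(x - y) > threshold for x, y in zip(row_a, row_b)]
        for row_a, row_b in zip(image_a, image_b)
    ]


def detect_motion(background, previous, current, threshold=30):
    """Merge background subtraction and frame difference with a logical OR
    (the merge rule is an assumption, not taken from the paper)."""
    background_mask = diff_mask(current, background, threshold)
    frame_mask = diff_mask(current, previous, threshold)
    return [
        [p or q for p, q in zip(row_b, row_f)]
        for row_b, row_f in zip(background_mask, frame_mask)
    ]


# A 1x5 scene: a bright two-pixel object moves one pixel to the right.
background = average_background([[[10, 10, 10, 10, 10]]] * 3)
previous = [[10, 200, 200, 10, 10]]
current = [[10, 10, 200, 200, 10]]
mask = detect_motion(background, previous, current)
```

In this toy case the frame difference responds only at the leading and trailing edges of the object, while background subtraction fills in its interior, which is the complementary behavior the abstract relies on. The OR merge also keeps the pixel the object has just left, so a stricter merge rule may be preferable in practice.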
Because it correlates with human perception, quality of experience (QoE) is, in addition to quality of service (QoS), an important measure in evaluating video quality. A cross-layer scheme based on the Lyapunov optimization framework is proposed for H.264/AVC video streaming over wireless ad hoc networks to improve both QoE and QoS performance. Unlike existing works, this scheme routes and schedules video packets according to the statuses of the frame buffers at the destination nodes to reduce buffer underflows and to increase video playout continuity. The waiting time of the head-of-line packets of the data queues is considered in routing and scheduling to reduce the average end-to-end delay of video sessions. Different types of packets are assigned different priorities according to their generation rates under H.264/AVC. To reduce the computational complexity, a distributed media access control policy and a power control algorithm cooperating with the media access policy are proposed. Simulation results show that, compared with existing schemes, this scheme improves both QoS and QoE performance. The average peak signal-to-noise ratio (PSNR) of the received video streams is also increased.
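For reference, PSNR is computed from the mean squared error between a reference frame and the received frame. The sketch below uses the standard definition for 8-bit samples (peak value 255) on flat lists of pixel values; the sample data are made up.

```python
import math


def psnr(reference, received, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if len(reference) != len(received):
        raise ValueError("inputs must have the same number of samples")
    mse = sum((a - b) ** 2 for a, b in zip(reference, received)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals: no distortion
    return 10.0 * math.log10(peak * peak / mse)


# Each sample is off by 10 gray levels, so the MSE is 100.
value = psnr([100, 100], [110, 90])
```

A higher PSNR means the received frame is closer to the reference; averaging it over all frames of a stream gives the per-stream figure the abstract refers to.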
Wyner-Ziv Video Coding (WZVC) is considered a promising video coding scheme for Wireless Video Sensor Networks (WVSNs) due to its high compression efficiency and error resilience, as well as its low encoding complexity. To achieve good Rate-Distortion (R-D) performance, current WZVC paradigms usually adopt an end-to-end rate control scheme in which the decoder repeatedly requests additional decoding data from the encoder for decoding Wyner-Ziv frames. As a result, the waiting time for the additional decoding data is especially long in multihop WVSNs. In this paper, we propose a novel progressive in-network rate control scheme for WZVC. The proposed in-network puncturing-based rate control scheme transfers part of the channel code puncturing task from the encoder to the relay nodes. The decoder can then request the additional decoding data from the relay nodes instead of the encoder, and the total waiting time for decoding Wyner-Ziv frames is consequently reduced. Simulation results validate the proposed rate control scheme.
Funding: Science and Technology Funds from the Liaoning Education Department (Serial Number: LJKZ0104).
Funding: This work was supported by the Natural Science Foundation of Gansu Province under Grant Nos. 21JR7RA570 and 20JR10RA334; the Basic Research Program of Gansu Province under Grant No. 22JR11RA106; the Gansu University of Political Science and Law Major Scientific Research and Innovation Projects under Grant No. GZF2020XZDA03; the Young Doctoral Fund Project of Higher Education Institutions in Gansu Province in 2022 under Grant No. 2022QB-123; the Gansu Province Higher Education Innovation Fund Project under Grant No. 2022A-097; the University-Level Research Funding Project under Grant No. GZFXQNLW022; and the University-Level Innovative Research Team of Gansu University of Political Science and Law.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 62272478 and 62102451); the National Defense Science and Technology Independent Research Project (Intelligent Information Hiding Technology and Its Applications in a Certain Field); and the Science and Technology Innovation Team Innovative Research Project "Research on Key Technologies for Intelligent Information Hiding" (Grant No. ZZKY20222102).
Abstract: Recent research advances in implicit neural representation have shown that a wide range of video data distributions can be achieved by sharing model weights in Neural Representation for Videos (NeRV). While explicit methods exist for accurately embedding ownership or copyright information in video data, the nascent NeRV framework has yet to address this issue comprehensively. In response, this paper introduces MarkINeRV, a scheme that embeds watermarking information into video frames using an invertible neural network watermarking approach to protect the copyright of NeRV. The scheme models the embedding and extraction of watermarks as a pair of inverse processes of a reversible network and uses the same network for both; only the direction of the information flow differs. Additionally, a video frame quality enhancement module is incorporated to mitigate watermarking information losses in the rendering process and the possibility of malicious attacks during transmission, ensuring the accurate extraction of watermarking information through the invertible network's inverse process. This paper evaluates the accuracy, robustness, and invisibility of MarkINeRV on multiple video datasets. The results demonstrate its efficacy in extracting watermarking information for copyright protection of NeRV. MarkINeRV represents a pioneering investigation into copyright issues surrounding NeRV.
Funding: Supported by the National Science and Technology Major Project (No. 2011ZX03005-004-04); the National Grand Fundamental Research 973 Program of China (No. 2011CB302-905); the National Natural Science Foundation of China (Nos. 61170058, 61272133, and 51274202); the Research Fund for the Doctoral Program of Higher Education of China (No. 20103402110041); and the Suzhou Fundamental Research Project (No. SYG201143).
Abstract: Resource allocation is an important problem in ubiquitous networks. Most existing resource allocation methods consider only wireless networks, so they are not suitable for the ubiquitous network environment and harm the interests of individual users with unstable resource requirements. This paper considers multi-point video surveillance scenarios in a complex network environment with both wired and wireless networks. We introduce a utility estimated by the total costs of an individual network user. The problem is studied through mathematical modeling, and we propose an improved problem-specific branch-and-cut algorithm to solve it. The algorithm follows the divide-and-conquer principle and fully considers the duality feature of network selection. The experiment is conducted by simulation using C and Lingo. It shows that, compared with a centralized random allocation scheme and a cost-greedy allocation scheme, the proposed scheme performs better, reducing the total costs for the user by 13.0% and 30.6%, respectively.
Abstract: Action recognition is an important topic in computer vision. Recently, deep learning technologies have been successfully used in many applications, including solving recognition problems in video data. However, most existing deep learning based recognition frameworks are not optimized for actions in surveillance videos. In this paper, we propose a novel method for recognizing different types of actions in outdoor surveillance videos. The proposed method first introduces motion compensation to improve the detection of human targets. Then, it uses three different types of deep models, with single and sequenced images as inputs, to recognize different types of actions. Finally, predictions from the different models are fused with a linear model. Experimental results show that the proposed method works well on real surveillance videos.
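The final fusion step can be sketched as a weighted sum of per-class scores. The abstract does not give the fusion weights or the score format, so the sketch below assumes each model outputs one score per action class, and the names `fuse_linear` and `predict` are hypothetical.

```python
from typing import List, Sequence


def fuse_linear(model_scores: Sequence[Sequence[float]],
                weights: Sequence[float]) -> List[float]:
    """Combine per-class scores from several models with a weighted sum."""
    if len(model_scores) != len(weights):
        raise ValueError("one weight per model is required")
    num_classes = len(model_scores[0])
    return [
        sum(w * scores[c] for w, scores in zip(weights, model_scores))
        for c in range(num_classes)
    ]


def predict(model_scores: Sequence[Sequence[float]],
            weights: Sequence[float]) -> int:
    """Return the index of the class with the highest fused score."""
    fused = fuse_linear(model_scores, weights)
    return max(range(len(fused)), key=fused.__getitem__)
```

In practice the weights would be fitted on held-out data, for example by linear or logistic regression over the stacked model outputs.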
Abstract: Anomaly detection is extremely important for intelligent surveillance videos. Deep learning algorithms have become popular for evaluating real-time surveillance recordings of events such as traffic accidents and criminal or unlawful incidents, as well as suicide attempts. Nevertheless, deep learning methods for classification, such as convolutional neural networks, require a lot of computing power. Quantum computing is a branch of technology that solves abnormal and complex problems using quantum mechanics. This research therefore focuses on developing a hybrid quantum computing model based on deep learning. It develops a Quantum Computing-based Convolutional Neural Network (QC-CNN) to extract features and classify anomalies from surveillance footage. A quantum circuit, such as the real amplitude circuit, is used to improve the performance of the model. To the best of our knowledge, this is the first work to employ quantum deep learning techniques to classify anomalous events in video surveillance applications. Thirteen anomalies from the UCF-crime dataset are classified. Experimental results show that the proposed model classifies the data efficiently with respect to the confusion matrix, Receiver Operating Characteristic (ROC), accuracy, Area Under Curve (AUC), precision, recall, and F1-score. The proposed QC-CNN attains a best accuracy of 95.65%, which is 5.37% higher than other existing models. To measure the efficiency of the proposed work, QC-CNN is also compared with classical and quantum models.
Abstract: To transfer color data from a device-dependent (video camera) color space into a device-independent color space, a multilayer feedforward network with the error backpropagation (BP) learning rule was used as a nonlinear transformer that maps the RGB color space to the CIELAB color space. Different network structures yielded different mapping accuracies. BP neural networks can provide satisfactory mapping accuracy for color space transformation in video cameras.
Funding: The National High Technology Research and Development Program of China (863 Program) (No. 2003AA1Z2130); the Science and Technology Project of Zhejiang Province (No. 2005C11001-02).
Abstract: A novel bandwidth prediction and control scheme is proposed for video transmission over an ad hoc network. The scheme is based on cross-layer, feedback, and Bayesian network techniques. The factors that affect video quality are formulated and derived, and the relevant factors are obtained by a cross-layer mechanism or a feedback method. From these factors, the variable set and the Bayesian network topology are determined, and a Bayesian network prediction model is then constructed. The prediction results can be used as the bandwidth of the mobile ad hoc network (MANET). According to this bandwidth, the video encoder is controlled to dynamically adjust and encode a real-time video stream at appropriate bit rates. An integrated simulation of a video streaming communication system is implemented to validate the proposed solution. Compared with the conventional transfer scheme, the experimental results indicate that the proposed scheme makes the best use of the network bandwidth, with considerable improvements in packet loss and in the visual quality of real-time video.
Funding: Project (No. DAG05/06.EG05) supported by the Research Grants Council (RGC) of Hong Kong, China.
Abstract: We are interested in providing Video-on-Demand (VoD) streaming service to a large population of clients using a peer-to-peer (P2P) approach. Given the asynchronous demands from multiple clients, the continuously changing buffered contents, and the requirement for continuous video display, how to collaborate with potential partners to obtain the expected data for future content delivery is both important and challenging. In this paper, we develop a novel scheduling algorithm based on deadline-aware network coding (DNC) to fully exploit network resources for efficient VoD service. DNC generalizes the existing network coding (NC) paradigm, an elegant solution for ubiquitous data distribution. With deadline awareness, DNC improves the network throughput while avoiding, with high probability, missed play deadlines, which are a major deficiency of conventional NC. Extensive simulation results demonstrate that DNC achieves high streaming continuity even in tight network conditions.
Abstract: In this paper, we propose a multi-source multi-path video streaming system for supporting high-quality concurrent video-on-demand (VoD) services over wireless mesh networks (WMNs), and leverage forward error correction to enhance the error resilience of the system. By taking wireless interference into consideration, we present a more realistic networking model that captures the characteristics of WMNs, and then design a route selection scheme using a joint rate/interference-distortion optimization framework to help the system optimally select concurrent streaming paths. We mathematically formulate this route selection problem and solve it heuristically using a genetic algorithm. Simulation results demonstrate the effectiveness of our proposed scheme.
Funding: National Natural Science Foundation of China (No. 61573095); Natural Science Foundation of Shanghai, China (No. 6ZR1446700).
Abstract: The devastating effects of wildland fire are an unsolved problem, resulting in human losses and the destruction of natural and economic resources. Convolutional neural networks (CNNs) have been shown to perform very well in object classification and can perform feature extraction and classification within the same architecture. In this paper, we propose a CNN for identifying fire in videos. A deep domain based method for video fire detection is proposed to extract a powerful feature representation of fire. Tested on real video sequences, the proposed approach achieves better classification performance than some relevant conventional video-based fire detection methods and indicates that using a CNN to detect fire in videos is efficient. To balance efficiency and accuracy, the model is fine-tuned considering the nature of the target problem and the fire data. Experimental results on benchmark fire datasets reveal the effectiveness of the proposed framework and validate its suitability for fire detection in closed-circuit television surveillance systems compared to state-of-the-art methods.
Abstract: Over the last decade, Wireless Sensor Networks (WSNs) have emerged as a powerful tool for connecting the physical and digital worlds. WSNs have been used in many applications, such as habitat monitoring, building monitoring, smart grids, and pipeline monitoring. In addition, a few researchers have been experimenting with WSNs in mission-critical applications such as military applications. This paper surveys the literature on experimental work in border surveillance and intrusion detection using WSN technology. The potential benefits of using WSNs in border surveillance are huge; however, to the best of our knowledge, very few attempts to solve the many critical issues of this application can be found in the literature.
Funding: We deeply acknowledge Taif University for supporting and funding this study through Taif University Researchers Supporting Project Number (TURSP-2020/115), Taif University, Taif, Saudi Arabia.
Abstract: In recent years, the number of gun-related incidents has exceeded 250,000 per year, and over 85% of the existing 1 billion firearms are in civilian hands. Manual monitoring has not proven effective in detecting firearms, which is why an automated weapon detection system is needed. Various automated convolutional neural network (CNN) weapon detection systems have been proposed in the past and generate good results. However, these techniques have high computation overhead and are too slow to provide the real-time detection that is essential for a weapon detection system. These models also have a high rate of false negatives because they often fail to detect guns due to the low quality and visibility issues of surveillance videos. This research work aims to minimize the rate of false negatives and false positives in weapon detection while keeping the speed of detection as a key parameter. The proposed framework is based on You Only Look Once (YOLO) and Area of Interest (AOI). Initially, the models take pre-processed frames in which the background is removed using the Gaussian blur algorithm. The proposed architecture is assessed through various performance parameters, such as false negatives, false positives, precision, recall rate, and F1 score. The results show that YOLO-v5s achieves a high recall rate and a high detection speed: 0.010 s per frame, compared to 0.17 s for Faster R-CNN. The approach is promising for use in the field of security and weapon detection.
Abstract: Objective: To explore and visualize the connectivity of suspected Ebola cases and surveillance callers who used cellphone technology in Moyamba District in Sierra Leone for Ebola surveillance, and to examine the demographic differences and characteristics of Ebola surveillance callers who make more calls as well as those callers who are more likely to make at least one positive Ebola call. Methods: Surveillance data for 393 suspected Ebola cases (192 males, 201 females) were collected from October 23, 2014 to June 28, 2015 using cellphone technology. UCINET and NetDraw software were used to explore and visualize the social connectivity between callers and suspected Ebola cases. Poisson and logistic regression were used for multivariable analysis. Results: The entire social network comprised 393 ties and 745 nodes. Women (AOR = 0.33, 95% CI [0.14, 0.81]) were associated with decreased odds of making at least one positive Ebola surveillance call compared to men. Women (IR = 0.63, 95% CI [0.49, 0.82]) were also associated with making fewer Ebola surveillance calls compared to men. Conclusion: Social network visualization can analyze syndromic surveillance data for Ebola collected by cellphone technology with unique insights.
Funding: Supported by the Fundamental Research Funds for the Central Universities (No. SWU115002, No. XDJK2015C104).
Abstract: The increasing popularity of smart mobile devices and the rise of online services have increased the requirements for efficient dissemination of social video content. In this paper, we study the problem of distributing video from a cloud server to users in a partially connected cooperative D2D network using network coding. In such a scenario, transmission conflicts arise from simultaneous transmissions by multiple devices, and the scheduling decision must be made not only on the encoded packets but also on the set of transmitting devices. We analyze the lower bound and give an integer linear formulation of the joint optimization problem over the set of transmitting devices and the packet combinations. We also propose a heuristic solution for this setup using a conflict graph and a local graph at every device. Simulation results show that our coding scheme significantly reduces the number of transmission slots, which increases the efficiency of video delivery.
Funding: Supported in part by the National Science Foundation of China under Grants 61272397, 61572538, 61174152, and 61331008, and in part by the Guangdong Natural Science Funds for Distinguished Young Scholar under Grant S20120011187.
Abstract: Most previous video recording devices in mobile vehicles store captured video content locally. With the rapid development of 4G/Wi-Fi networks, a new trend has emerged of equipping video recording devices with wireless interfaces so that video can be uploaded to the cloud for later playback. In this paper, we propose a QoE-aware mobile cloud video recording scheme for roadside vehicular networks, which adaptively selects the proper wireless interface and video bitrate for uploading video to the cloud. To maximize the total utility, a control strategy is needed that carefully balances the transmission cost and the QoE achieved for users. To this end, we investigate the tradeoff between the cost incurred by uploading through cellular networks and the achieved QoE of users. We apply the optimization framework to solve the formulated problem and design an online scheduling algorithm. We also conduct extensive trace-driven simulations, and our results show that our algorithm achieves a good balance between transmission cost and user QoE.
Abstract: In video surveillance, moving object tracking is affected by many interference factors, such as target changes, complex scenes, and target deformation. To address this issue, and based on a comparative analysis of several common moving object detection methods, this paper presents a moving object detection and recognition algorithm that combines frame differencing with background subtraction. The algorithm first obtains the background image as the statistical average of a continuous image sequence: the gray values of N consecutively captured frames are summed and averaged. This increases the weight of the object information while suppressing the static background. The resulting motion detection image contains both the target contour and additional target information recovered from the background image, which allows the moving target to be separated from the image. Simulation results show the effectiveness of the proposed algorithm.
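The combination of the two detectors can be sketched in a few lines. This is an illustrative sketch rather than the paper's algorithm: it assumes grayscale frames stored as nested lists, a fixed threshold of 30 gray levels, and a logical OR to merge the two masks, none of which the abstract specifies. All function names are hypothetical.

```python
from typing import List

Frame = List[List[float]]  # grayscale image as rows of pixel intensities
Mask = List[List[bool]]


def mean_background(frames: List[Frame]) -> Frame:
    """Estimate the static background as the per-pixel average of N frames."""
    n = len(frames)
    rows, cols = len(frames[0]), len(frames[0][0])
    return [
        [sum(f[r][c] for f in frames) / n for c in range(cols)]
        for r in range(rows)
    ]


def abs_diff_mask(a: Frame, b: Frame, threshold: float) -> Mask:
    """Mark pixels whose absolute intensity difference exceeds the threshold."""
    return [
        [abs(pa - pb) > threshold for pa, pb in zip(row_a, row_b)]
        for row_a, row_b in zip(a, b)
    ]


def detect_motion(prev: Frame, curr: Frame, background: Frame,
                  threshold: float = 30.0) -> Mask:
    """Merge frame differencing and background subtraction.

    Assumption: the masks are merged with a logical OR, so a pixel counts as
    moving if either detector fires.
    """
    frame_mask = abs_diff_mask(curr, prev, threshold)
    bg_mask = abs_diff_mask(curr, background, threshold)
    return [
        [f or b for f, b in zip(f_row, b_row)]
        for f_row, b_row in zip(frame_mask, bg_mask)
    ]
```

Frame differencing responds mainly to the edges of a moving object, while background subtraction recovers its interior. Merging the two masks is one simple way to obtain both the contour and the additional target information the abstract mentions.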
Abstract: Because it correlates with human perception, quality of experience (QoE) is an important measure for evaluating video quality, in addition to quality of service (QoS). A cross-layer scheme based on the Lyapunov optimization framework is proposed for H.264/AVC video streaming over wireless ad hoc networks to improve both QoE and QoS performance. Unlike existing works, this scheme routes and schedules video packets according to the status of the frame buffers at the destination nodes, which reduces buffer underflows and increases video playout continuity. The waiting times of the head-of-line packets in the data queues are considered in routing and scheduling to reduce the average end-to-end delay of video sessions. Different types of packets are assigned different priorities according to their generation rates under H.264/AVC. To reduce the computational complexity, a distributed media access control policy and a power control algorithm that cooperates with it are proposed. Simulation results show that, compared with existing schemes, this scheme improves both QoS and QoE performance. The average peak signal-to-noise ratio (PSNR) of the received video streams is also increased.
Funding: This paper was supported by the National Key Basic Research Program of China under Grant No. 2011CB302701; the National Natural Science Foundation of China under Grants No. 60833009 and No. 61133015; the China National Funds for Distinguished Young Scientists under Grant No. 60925010; the Funds for Creative Research Groups of China under Grant No. 61121001; and the Program for Changjiang Scholars and Innovative Research Team in University under Grant No. IRT1049.
Abstract: Wyner-Ziv Video Coding (WZVC) is considered a promising video coding scheme for Wireless Video Sensor Networks (WVSNs) due to its high compression efficiency and error resilience, as well as its low encoding complexity. To achieve good Rate-Distortion (R-D) performance, current WZVC paradigms usually adopt an end-to-end rate control scheme in which the decoder repeatedly requests additional decoding data from the encoder in order to decode Wyner-Ziv frames. The waiting time for this additional decoding data is therefore especially long in multihop WVSNs. In this paper, we propose a novel progressive in-network rate control scheme for WZVC. The proposed in-network puncturing-based rate control scheme transfers part of the channel code puncturing task from the encoder to the relay nodes. The decoder can then request the additional decoding data from the relay nodes instead of the encoder, which reduces the total waiting time for decoding Wyner-Ziv frames. Simulation results validate the proposed rate control scheme.