Video transcoding creates multiple representations of a video for content adaptation. It is regarded as a core technique in Adaptive BitRate (ABR) streaming. How video transcoding is managed affects the performance of ABR streaming in various aspects, including operational cost, streaming delay, and Quality of Experience (QoE). Therefore, the problems of implementing video transcoding in ABR streaming must be systematically studied to improve the overall performance of streaming services. These problems have become more worthy of investigation with the emergence of the edge-cloud continuum, which makes resource allocation for video transcoding more complicated. To this end, this paper investigates the main technical problems related to video transcoding in ABR streaming, including designing a rate profile for video transcoding, provisioning resources for video transcoding in clouds, and caching multi-bitrate video content in networks. We analyze these problems from the perspective of resource allocation in the edge-cloud continuum and cast them as resource and Quality of Service (QoS) optimization problems. The goal is to minimize resource consumption while guaranteeing the QoS for ABR streaming. We also discuss some promising research directions for ABR streaming services.
Adaptive bitrate (ABR) video streaming has become a critical technique for mobile video streaming to cope with time-varying network conditions and different user preferences. However, there are still many problems in achieving high-quality ABR video streaming over cellular networks. Mobile Edge Computing (MEC) is a promising paradigm to overcome these problems by providing video transcoding capability and caching ABR video streams within the radio access network (RAN). In this paper, we propose a flexible transcoding strategy to provide viewers with low-latency video streaming services in MEC networks under limited storage, computing, and spectrum resources. Using the information collected from users, the MEC server acts as a controlling component that adjusts the transcoding strategy flexibly by optimizing the video caching placement strategy. Specifically, we cache the proper bitrate version of the video segments at the edge servers and select the appropriate bitrate version of the video segments to transcode, while jointly considering access control, resource allocation, and user preferences. We formulate this problem as a nonconvex, mixed combinatorial optimization problem. The simulation results indicate that our proposed algorithm can ensure a low-latency viewing experience for users.
With the popularity of smart handheld devices, mobile streaming video has multiplied global network traffic in recent years. Strong concern for users' quality of experience (QoE) has made rate adaptation methods very attractive. In this paper, we propose a two-phase rate adaptation strategy to improve users' real-time video QoE. First, to measure and assess video QoE, we provide a continuous QoE prediction engine modeled by a recurrent neural network (RNN). Unlike traditional QoE models, which consider the QoE-aware factors separately or incompletely, our RNN-QoE model accounts for three descriptive factors (video quality, rebuffering, and rate change) and reflects the impact of cognitive memory and recency. In addition, video playback is separated into an initial startup phase and a steady playback phase, and we adopt a different optimization goal for each phase: the former aims at shortening the startup delay, while the latter improves the video quality and reduces rebuffering. Simulation results show that RNN-QoE follows the subjective QoE quite well, and that the proposed strategy effectively reduces the occurrence of rebuffering caused by the mismatch between the requested video rates and the fluctuating throughput, attaining standout performance on real-time QoE compared with classical rate adaptation methods.
With the increasing popularity of solid-state lighting devices, Visible Light Communication (VLC) is globally recognized as an advanced and promising technology for short-range, high-speed, and large-capacity wireless data transmission. In this paper, we propose a prototype of a real-time audio and video broadcast system using inexpensive, commercially available light-emitting diode (LED) lamps. Experimental results show that real-time, high-quality audio and video can be achieved at a maximum distance of 3 m through proper layout of the LED sources and improvement of concentration effects. A lighting model for a room environment is designed and simulated, which indicates a close relationship between the layout of the light sources and the distribution of illuminance.
Efficient video delivery involves transcoding the original sequence into various resolutions, bitrates, and standards in order to match viewers' capabilities. Since video coding and transcoding are computationally demanding, performing a portion of these tasks at the network edge promises to decrease both the workload of, and the network traffic towards, the data centers of media providers. Motivated by the increasing popularity of live casting on social media platforms, in this paper we focus on the case of live video transcoding. Specifically, we investigate scheduling heuristics that decide which jobs should be assigned to an edge mini-datacenter and which to a backend datacenter. Through simulation experiments with different QoS requirements, we identify the best alternative.
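To make the edge-versus-backend decision concrete, here is a minimal sketch of one possible heuristic, an "edge-first" rule. It is not taken from the paper, whose heuristics are not described in the abstract. The function name, the job tuple layout, and the single CPU-capacity model are all illustrative assumptions.

```python
def edge_first(jobs, edge_capacity):
    """Place each transcoding job on the edge while capacity remains.

    Hypothetical model (not from the paper): `jobs` is a list of
    (job_id, cpu_demand) pairs in arrival order, and the edge
    mini-datacenter has a fixed CPU budget `edge_capacity`.
    Jobs that do not fit on the edge go to the backend datacenter.
    """
    remaining = edge_capacity
    placement = {}
    for job_id, cpu_demand in jobs:
        if cpu_demand <= remaining:
            placement[job_id] = "edge"
            remaining -= cpu_demand
        else:
            placement[job_id] = "backend"
    return placement


if __name__ == "__main__":
    live_jobs = [("stream-a", 2), ("stream-b", 3), ("stream-c", 1)]
    print(edge_first(live_jobs, edge_capacity=3))
```

A QoS-aware variant would also compare each job's deadline against the estimated completion time at each site. That is left out here because the abstract gives no details on the QoS model.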
The new H.264 video coding standard achieves significantly higher compression performance than MPEG-2. Because MPEG-2 is popular in digital TV, DVD, and similar applications, bandwidth or memory space can be saved by transcoding those streams into H.264. Unfortunately, the high complexity of transcoding keeps it from being widely used in practical applications. This paper proposes an efficient transcoding architecture with a smart downscaling decoder and a fast mode decision algorithm. With the proposed architecture, a large amount of buffering memory is saved and the transcoding complexity is reduced. The performance of the proposed fast mode decision algorithm is validated by experiments.
In recent years, real-time video streaming has grown in popularity. The growing popularity of the Internet of Things (IoT) and other wireless heterogeneous networks requires that network resources be carefully apportioned among diverse users in order to achieve the best Quality of Experience (QoE) and performance objectives. Most researchers have focused on Forward Error Correction (FEC) techniques when attempting to strike a balance between QoE and performance. However, as network capacity increases, performance degrades, impacting the live visual experience. Recently, Deep Learning (DL) algorithms have been successfully integrated with FEC to stream videos across multiple heterogeneous networks, but these algorithms need to be changed to improve the experience without worsening packet loss and delay. To address this challenge, this paper proposes a novel intelligent algorithm that streams video in multi-home heterogeneous networks based on network-centric characteristics. The proposed framework contains modules such as the Intelligent Content Extraction Module (ICEM), the Channel Status Monitor (CSM), and Adaptive FEC (AFEC). The framework adopts the Cognitive Learning-based Scheduling (CLS) Module, which works on the deep Reinforced Gated Recurrent Networks (RGRN) principle, and embeds it along with the FEC to achieve better performance. The complete framework was developed using the Objective Modular Network Testbed in C++ (OMNeT++), Internet networking (INET), and Python 3.10, with Keras as the front end and TensorFlow 2.10 as the back end. In extensive experiments, the proposed model outperforms other existing intelligent models in terms of improving the QoE, minimizing the End-to-End Delay (EED), and maintaining the highest accuracy (98%) and a lower Root Mean Square Error (RMSE) value of 0.001.
A novel bandwidth prediction and control scheme is proposed for video transmission over an ad hoc network. The scheme is based on cross-layer, feedback, and Bayesian network techniques. The factors that affect video quality are formulated and derived. The relevant factors are obtained by a cross-layer mechanism or a feedback method. From these factors, the variable set and the Bayesian network topology are determined, and a Bayesian network prediction model is then constructed. The prediction results can be used as the bandwidth of the mobile ad hoc network (MANET). According to this bandwidth, the video encoder is controlled to dynamically adjust and encode the right bit rates for a real-time video stream. An integrated simulation of a video streaming communication system is implemented to validate the proposed solution. Compared with the conventional transfer scheme, the experimental results indicate that the proposed scheme makes the best use of the network bandwidth, with considerable improvements in packet loss and in the visual quality of real-time video.
The scalable extension of H.264/AVC, known as scalable video coding or SVC, is currently the main focus of the Joint Video Team's work. In its present working draft, the higher-level syntax of SVC follows the design principles of H.264/AVC. Self-contained network abstraction layer units (NAL units) form natural entities for packetization. The SVC specification is by no means finalized yet, but work towards an optimized RTP payload format has nevertheless already started. RFC 3984, the RTP payload specification for H.264/AVC, has been taken as a starting point, but it quickly became clear that the scalable features of SVC require adaptation in at least the areas of capability/operation point signaling and documentation of the extended NAL unit header. This paper first gives an overview of the history of scalable video coding, and then reviews the video coding layer (VCL) and NAL of the latest SVC draft specification. Finally, it discusses different aspects of the draft SVC RTP payload format, including the design criteria, use cases, signaling, and payload structure.
A wireless multi-hop video transmission experiment system is designed and implemented for vehicular ad hoc networks (VANETs), and a transmission control protocol and a routing protocol are proposed. The system integrates an embedded Linux system with an ARM kernel and consists of an S3C6410 main control module, a wireless local area network (WLAN) card, an LCD screen, and so on. In the scenario of wireless multi-hop video transmission, both H.264 and JPEG are used, and their performance in terms of compression rate, delay, and frame loss rate is analyzed in theory and compared experimentally. The system is tested in real indoor and outdoor environments. The results show that the multi-hop video transmission scheme is applicable to VANETs and to multiple scenes, and that the proposed transmission control protocol and routing protocol achieve real-time transmission and meet multi-hop requirements.
Sports video appeals to large audiences and has high commercial potential. Automatically extracting useful semantic information and generating highlight summaries from sports video to meet users' access requirements is an important problem, especially given forthcoming broadband mobile communication and users' need to access the multimedia information they are interested in from anywhere, at any time, with their most convenient digital devices. A system that generates highlight summaries for mobile applications is introduced, comprising highlight extraction and video adaptation. In this system, several highlight extraction techniques are provided for field sports video and racket sports video using multi-modal information. To enhance users' viewing experience and save bandwidth, 3D animation is also generated from highlight segments. As an important step in making video analysis results universally applicable, video transcoding techniques are applied to adapt the video to the mobile communication environment and to user preferences. Experimental results are encouraging and show the advantages and feasibility of the system for multimedia content personalization, enhancement, and adaptation to meet different user preferences and network/device requirements.
This paper proposes a mobile video surveillance system consisting of intelligent video analysis and mobile communication networking. This multilevel distillation approach helps mobile users monitor large volumes of surveillance video on demand through video streaming over mobile communication networks. The intelligent video analysis includes moving object detection/tracking and key frame selection, which make it possible to browse useful video clips. The communication networking services, comprising video transcoding, multimedia messaging, and mobile video streaming, transmit surveillance information to mobile appliances. Moving object detection is achieved by background subtraction and particle filter tracking. Key frame selection, which aims to deliver an alarm to a mobile client using the multimedia messaging service accompanied by an extracted clear frame, is achieved by devising a weighted importance criterion that considers object clarity and face appearance. In addition, a spatial-domain cascaded transcoder is developed to convert the filtered image sequence of detected objects into the mobile video streaming format. Experimental results show that the system can successfully detect all events of moving objects in a complex surveillance scene, choose very appropriate key frames for users, and transcode the images with a high peak signal-to-noise ratio (PSNR).
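Since the transcoder above is evaluated by PSNR, a short sketch of how that metric is computed may help. This is the standard definition, PSNR = 10 log10(MAX^2 / MSE), applied to grayscale images stored as nested lists. The function name and the input layout are illustrative assumptions, not part of the paper.

```python
import math


def psnr(reference, distorted, max_value=255):
    """Peak signal-to-noise ratio in dB between two grayscale images.

    Both images are lists of rows of pixel intensities (an assumed
    layout for illustration). `max_value` is the peak pixel value,
    255 for 8-bit video.
    """
    ref = [p for row in reference for p in row]
    dis = [p for row in distorted for p in row]
    if not ref or len(ref) != len(dis):
        raise ValueError("images must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref, dis)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    original = [[10, 10], [200, 200]]
    transcoded = [[11, 9], [201, 199]]
    print(f"PSNR: {psnr(original, transcoded):.2f} dB")
```

A higher PSNR means the transcoded frames are closer to the source frames. Identical frames give an infinite value because the mean squared error is zero.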
HTTP Adaptive Streaming (HAS) of video content is becoming an integral part of the Internet and accounts for most of today's network traffic. Video compression technology plays a vital role in efficiently utilizing network channels, but encoding videos into multiple representations with selected encoding parameters is a significant challenge. Video encoding is a computationally intensive and time-consuming operation that requires high-performance resources provided by on-premise infrastructures or public clouds. Public clouds, such as Amazon Elastic Compute Cloud (EC2), in turn provide hundreds of computing instances optimized for different purposes and client budgets. Thus, there is a need for algorithms and methods that select computing instances optimally for specific tasks such as video encoding and transcoding. Additionally, the encoding speed depends directly on the selected encoding parameters and on the complexity characteristics of the video content. In this paper, we first benchmarked the video encoding performance of Amazon EC2 spot instances using multiple x264 codec encoding parameters and video sequences of varying complexity. Then, we proposed a novel, fast approach to optimize the selection of Amazon EC2 spot instances and minimize video encoding costs. Furthermore, we evaluated how the optimized selection of EC2 spot instances affects the encoding cost. The results show that our approach can, on average, reduce the encoding costs by at least 15.8% and by up to 47.8% compared to a random selection of EC2 spot instances.
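The cost trade-off described above can be illustrated with a small sketch. It assumes a deliberately simple cost model, hourly spot price multiplied by encoding time, and compares the cheapest instance with the average cost of a uniformly random pick. The class, the function names, the instance labels, and all prices and timings are hypothetical. This is not the paper's algorithm and does not use real EC2 data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpotInstance:
    name: str               # hypothetical label, not a real EC2 type
    price_per_hour: float   # assumed spot price in USD
    encode_seconds: float   # benchmarked or predicted encoding time


def encoding_cost(inst):
    """Cost of one encoding task: hourly price times hours used."""
    return inst.price_per_hour * inst.encode_seconds / 3600.0


def cheapest_instance(candidates, deadline_seconds=None):
    """Pick the lowest-cost instance, optionally under a time limit."""
    feasible = [c for c in candidates
                if deadline_seconds is None
                or c.encode_seconds <= deadline_seconds]
    if not feasible:
        raise ValueError("no instance meets the deadline")
    return min(feasible, key=encoding_cost)


def expected_random_cost(candidates):
    """Average cost when the instance is chosen uniformly at random."""
    return sum(encoding_cost(c) for c in candidates) / len(candidates)


if __name__ == "__main__":
    pool = [
        SpotInstance("type-a", 0.10, 3600),
        SpotInstance("type-b", 0.30, 900),
        SpotInstance("type-c", 0.05, 9000),
    ]
    best = cheapest_instance(pool)
    saving = 1 - encoding_cost(best) / expected_random_cost(pool)
    print(best.name, f"saves {saving:.0%} versus a random pick")
```

The sketch shows why the fastest or the cheapest-per-hour instance is not necessarily the cheapest for a given task: the product of price and encoding time is what matters, and encoding time depends on the encoding parameters and content complexity.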
360-degree video streaming services over the network are becoming popular. In particular, it is easy to experience 360-degree video on already widespread smartphones. However, because the amount of data to send is larger than for conventional video, it is difficult to provide a stable streaming service in a general network environment. In addition, the area the user actually views is very small compared to the amount of data sent. In this paper, we propose a system that can provide high-quality 360-degree video streaming services to users more efficiently in the cloud. In particular, we propose a streaming system focused on the use of a head-mounted display (HMD).
Automated live video stream analytics has been extensively researched in recent times. Most traditional methods for video anomaly detection are supervised and use a single classifier to identify an anomaly in a frame. We propose a three-stage, ensemble-based, unsupervised deep reinforcement algorithm with an underlying Long Short-Term Memory (LSTM) based Recurrent Neural Network (RNN). In the first stage, an ensemble of LSTM-RNNs is deployed to generate the anomaly score. The second stage uses the least squares method for optimal anomaly score generation. The third stage adopts reward-based reinforcement learning to update the model. The proposed Hybrid Ensemble RR Model was tested on the standard pedestrian datasets UCSD Ped1 and UCSD Ped2. The data set has 70 videos in UCSD Ped1 and 28 videos in UCSD Ped2, with a total of 18560 frames. Since a real-time stream has strict memory constraints and storage issues, a simple computing machine does not suffice for performing analytics on stream data. Hence, the proposed research is designed to work on a framework supported by a GPU (Graphics Processing Unit) or TPU (Tensor Processing Unit). As shown in the experimental results section, recorded observations of frame-level EER (Equal Error Rate) and AUC (Area Under Curve) showed a 9% reduction in EER on UCSD Ped1, a 13% reduction in EER on UCSD Ped2, and a 4% improvement in accuracy on both datasets.
An augmented virtual environment (AVE) is concerned with the fusion of real-time video with 3D models or scenes so as to augment the virtual environment. In this paper, a new approach to establishing an AVE with a wide field of view is proposed, including real-time video projection, multiple video texture fusion, and 3D visualization of moving objects. A new diagonally weighted algorithm is proposed to smooth the apparent gaps within the overlapping area between two adjacent videos. A visualization method for the location and trajectory of a moving virtual object is proposed to display the moving object and its trajectory in the 3D virtual environment. The experimental results show that the proposed set of algorithms is able to fuse multiple real-time videos with 3D models efficiently: the experiment runs a 3D scene containing two million triangles and six real-time videos at around 55 frames per second on a laptop with 1 GB of graphics card memory. In addition, a realistic AVE with a wide field of view was created on the Digital Earth Science Platform by fusing three videos with a complex indoor virtual scene, visualizing a moving object, and drawing its trajectory in real time.
Real-time variable bit rate (VBR) video is expected to account for a significant portion of multimedia applications. However, its highly bursty traffic raises many challenges for VBR video service provision. To support multi-user real-time VBR video transmission with high bandwidth utilization and satisfactory quality of service (QoS), this article proposes a practical dynamic bandwidth management scheme. The scheme forecasts the future media rate of VBR video by employing a time-domain adaptive linear predictor and uses the media delivery index (MDI) as both a QoS measurement and a complementary management reference. In addition, to support multi-user applications, a priority-classified adjustment strategy is put forward. Finally, a test bed based on this management scheme is established. The experimental results demonstrate that the proposed scheme is efficient, increasing bandwidth utilization by 20%-60% compared to a fixed service rate while guaranteeing QoS.
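As an illustration of what a time-domain adaptive linear predictor can look like, the sketch below uses the normalized least-mean-squares (NLMS) update to forecast the next media rate from the most recent samples. NLMS is chosen here as a common example. The abstract does not say which adaptation rule the article uses, and the function name, filter order, and step size are assumptions.

```python
def nlms_predict(rates, order=4, mu=0.5, eps=1e-8):
    """One-step-ahead rate prediction with an NLMS adaptive filter.

    `rates` is a sequence of past media rates (for example, kbit/s
    per interval). Returns the predictions for rates[order:] and the
    final filter weights. Parameter values are illustrative only.
    """
    weights = [0.0] * order
    predictions = []
    for n in range(order, len(rates)):
        window = rates[n - order:n][::-1]  # most recent sample first
        predicted = sum(w * x for w, x in zip(weights, window))
        predictions.append(predicted)
        error = rates[n] - predicted
        # Normalizing by the input energy keeps the step size stable
        # across traffic with very different rate magnitudes.
        energy = sum(x * x for x in window) + eps
        weights = [w + mu * error * x / energy
                   for w, x in zip(weights, window)]
    return predictions, weights


if __name__ == "__main__":
    trace = [800, 820, 790, 1500, 1480, 1510, 900, 880, 910, 890]
    preds, _ = nlms_predict(trace, order=3)
    for actual, pred in zip(trace[3:], preds):
        print(f"actual={actual:5d}  predicted={pred:8.1f}")
```

In a bandwidth manager, each prediction would feed the allocation for the next interval. A QoS indicator such as MDI could then correct the allocation when the predictor lags behind a burst.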
Funding (video transcoding for ABR streaming in the edge-cloud continuum): Supported in part by the Natural Science Foundation of Jiangsu Province under Grant BK20200486.
Funding (flexible transcoding strategy for MEC networks): Supported by the National Natural Science Foundation of China (Nos. 61771070 and 61671088).
Funding (two-phase rate adaptation with RNN-QoE): Supported by the National Natural Science Foundation of China (NSFC 60622110, 61471220, 91538107, 91638205), the National Basic Research Project of China (973, 2013CB329006), and GY22016058.
Funding (MPEG-2 to H.264 transcoding architecture): Project (No. CNGI-04-15-2A) supported by the China Next Generation Internet (CNGI).
Funding (Bayesian bandwidth prediction for ad hoc networks): The National High Technology Research and Development Program of China (863 Program) (No. 2003AA1Z2130) and the Science and Technology Project of Zhejiang Province (No. 2005C11001-02).
Abstract: A novel bandwidth prediction and control scheme is proposed for video transmission over an ad hoc network. The scheme is based on cross-layer, feedback, and Bayesian network techniques. The impacts on video quality are formulated and derived. The relevant factors are obtained by a cross-layer mechanism or a feedback method. According to these relevant factors, the variable set and the Bayesian network topology are determined, and a Bayesian network prediction model is then constructed. The prediction results can be used as the bandwidth of the mobile ad hoc network (MANET). According to this bandwidth, the video encoder is controlled to dynamically adjust and encode a real-time video stream at the right bit rate. An integrated simulation of a video streaming communication system is implemented to validate the proposed solution. Compared with the conventional transfer scheme, the experimental results indicate that the proposed scheme makes the best use of the network bandwidth, with considerable improvements in the packet loss and the visual quality of real-time video.
Abstract: The scalable extension of H.264/AVC, known as scalable video coding or SVC, is currently the main focus of the Joint Video Team's work. In its present working draft, the higher-level syntax of SVC follows the design principles of H.264/AVC. Self-contained network abstraction layer units (NAL units) form natural entities for packetization. The SVC specification is by no means finalized yet, but work towards an optimized RTP payload format has nevertheless already started. RFC 3984, the RTP payload specification for H.264/AVC, has been taken as a starting point, but it quickly became clear that the scalable features of SVC require adaptation in at least the areas of capability/operation point signaling and documentation of the extended NAL unit header. This paper first gives an overview of the history of scalable video coding, and then reviews the video coding layer (VCL) and NAL of the latest SVC draft specification. Finally, it discusses different aspects of the draft SVC RTP payload format, including the design criteria, use cases, signaling and payload structure.
Funding: The National Natural Science Foundation of China (No. 61201175, 61171081); the Transformation Program of Science and Technology Achievements of Jiangsu Province (No. BA2010023)
Abstract: A wireless multi-hop video transmission experiment system is designed and implemented for vehicular ad-hoc networks (VANET), and a transmission control protocol and a routing protocol are proposed. The system integrates the embedded Linux system with an ARM kernel and consists of an S3C6410 main control module, a wireless local area network (WLAN) card, an LCD screen and so on. In the scenario of wireless multi-hop video transmission, both H.264 and JPEG are used, and their performance in terms of compression rate, delay and frame loss rate is analyzed in theory and compared in experiments. The system is tested in real indoor and outdoor environments. The results show that the multi-hop video transmission experiment system is applicable to VANET and to multiple scenes, and that the proposed transmission control protocol and routing protocol can achieve real-time transmission and meet multi-hop requirements.
Funding: Project supported by NEC Research of China (No. 0P2004001), the "Science 100 Plan" of the Chinese Academy of Sciences (No. m2041), and the Natural Science Foundation (No. 4063041) of Beijing, China
Abstract: Sports video appeals to large audiences and has high commercial potential. Automatically extracting useful semantic information and generating highlight summaries from sports video to meet users' access requirements is an important problem, especially given the forthcoming broadband mobile communication and the need for users to access the multimedia information they are interested in from anywhere, at any time, with their most convenient digital devices. A system for generating highlight summaries oriented to mobile applications is introduced, which includes highlight extraction and video adaptation. In this system, several highlight extraction techniques are provided for field sports video and racket sports video by using multi-modal information. To enhance users' viewing experience and save bandwidth, 3D animation is also generated from highlight segments. As an important procedure for making video analysis results universally applicable, video transcoding techniques are applied to adapt the video to the mobile communication environment and to user preferences. Experimental results are encouraging and show the advantage and feasibility of the system for multimedia content personalization, enhancement and adaptation to meet different user preferences and network/device requirements.
Abstract: This paper proposes a mobile video surveillance system consisting of intelligent video analysis and mobile communication networking. This multilevel distillation approach helps mobile users monitor large volumes of surveillance video on demand through video streaming over mobile communication networks. The intelligent video analysis includes moving object detection/tracking and key frame selection, which make it possible to browse useful video clips. The communication networking services, comprising video transcoding, multimedia messaging, and mobile video streaming, transmit surveillance information to mobile appliances. Moving object detection is achieved by background subtraction and particle filter tracking. Key frame selection, which aims to deliver an alarm to a mobile client using the multimedia messaging service accompanied by an extracted clear frame, is achieved by devising a weighted importance criterion that considers object clarity and face appearance. In addition, a spatial-domain cascaded transcoder is developed to convert the filtered image sequence of detected objects into the mobile video streaming format. Experimental results show that the system can successfully detect all events of moving objects in a complex surveillance scene, choose very appropriate key frames for users, and transcode the images with a high peak signal-to-noise ratio (PSNR).
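The background subtraction step mentioned in this abstract can be illustrated with a minimal sketch. It assumes a running-average background model over small grayscale frames stored as nested lists; the function names, the `alpha` learning rate and the `threshold` value are illustrative assumptions, not details taken from the paper, and the particle filter tracking stage is not shown.

```python
# Minimal background-subtraction sketch (assumed running-average model;
# names and parameter values are hypothetical, not from the paper).

def update_background(background, frame, alpha=0.05):
    """Blend the new frame into the background estimate pixel by pixel."""
    return [
        [(1 - alpha) * b + alpha * f for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


def foreground_mask(background, frame, threshold=25):
    """Mark pixels whose absolute difference from the background exceeds the threshold."""
    return [
        [abs(f - b) > threshold for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


if __name__ == "__main__":
    background = [[10.0] * 3 for _ in range(3)]
    frame = [[10.0, 10.0, 10.0], [10.0, 200.0, 10.0], [10.0, 10.0, 10.0]]
    for row in foreground_mask(background, frame):
        print(row)
```

A small `alpha` makes the background adapt slowly, so a briefly stationary object is not absorbed into it right away; a real system would tune both parameters to the scene.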
Funding: This work has been supported in part by the Austrian Research Promotion Agency (FFG) under the APOLLO and Karnten Fog project.
Abstract: HTTP Adaptive Streaming (HAS) of video content is becoming an integral part of the Internet and accounts for most of today's network traffic. Video compression technology plays a vital role in efficiently utilizing network channels, but encoding videos into multiple representations with selected encoding parameters is a significant challenge. Video encoding is a computationally intensive and time-consuming operation that requires high-performance resources provided by on-premise infrastructures or public clouds. In turn, public clouds, such as Amazon Elastic Compute Cloud (EC2), provide hundreds of computing instances optimized for different purposes and clients' budgets. Thus, there is a need for algorithms and methods for optimized computing instance selection for specific tasks such as video encoding and transcoding operations. Additionally, the encoding speed directly depends on the selected encoding parameters and the complexity characteristics of the video content. In this paper, we first benchmarked the video encoding performance of Amazon EC2 spot instances using multiple x264 codec encoding parameters and video sequences of varying complexity. Then, we proposed a novel fast approach to optimize Amazon EC2 spot instances and minimize video encoding costs. Furthermore, we evaluated how the optimized selection of EC2 spot instances can affect the encoding cost. The results show that our approach can, on average, reduce the encoding costs by at least 15.8% and up to 47.8% when compared to a random selection of EC2 spot instances.
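The trade-off behind instance selection can be sketched as follows. This is not the paper's algorithm; it is a minimal illustration that assumes the cost of a job is the hourly spot price multiplied by the encoding time, and that the encoding speed of each instance type has already been benchmarked. The instance names, speeds and prices are made-up placeholders.

```python
# Sketch of cost-based instance selection (assumed cost model:
# hourly price x encoding time; all names and numbers are hypothetical).

def encoding_cost(frames, fps, price_per_hour):
    """Cost of encoding `frames` frames at `fps` frames/s on an instance billed per hour."""
    seconds = frames / fps
    return price_per_hour * seconds / 3600.0


def cheapest_instance(frames, instances):
    """Return the instance name with the lowest estimated encoding cost.

    `instances` maps a name to a (benchmarked_fps, price_per_hour) tuple.
    """
    return min(instances, key=lambda name: encoding_cost(frames, *instances[name]))


if __name__ == "__main__":
    benchmark = {
        "small": (10.0, 0.05),   # slow but cheap per hour
        "large": (60.0, 0.20),   # fast but pricier per hour
    }
    best = cheapest_instance(36_000, benchmark)
    print(best, encoding_cost(36_000, *benchmark[best]))
```

Under this model a faster instance can be cheaper overall even at a higher hourly price, which is why the benchmark of encoding speed per instance type matters.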
Abstract: 360-degree video streaming services over the network are becoming popular. In particular, 360-degree video is easy to experience on the already widespread smartphone. However, because the amount of data to send is larger than for conventional video, it is difficult to provide a stable streaming service in a general network environment. In addition, the area the user actually views is very small compared to the amount of data sent. In this paper, we propose a system that can provide high-quality 360-degree video streaming services to users more efficiently in the cloud. In particular, we propose a streaming system focused on the use of a head-mounted display (HMD).
Abstract: Automated live video stream analytics has been extensively researched in recent times. Most traditional methods for video anomaly detection are supervised and use a single classifier to identify an anomaly in a frame. We propose a 3-stage ensemble-based unsupervised deep reinforcement algorithm with an underlying Long Short-Term Memory (LSTM) based Recurrent Neural Network (RNN). In the first stage, an ensemble of LSTM-RNNs is deployed to generate the anomaly score. The second stage uses the least squares method for optimal anomaly score generation. The third stage adopts award-based reinforcement learning to update the model. The proposed Hybrid Ensemble RR Model was tested on the standard pedestrian datasets UCSD Ped1 and UCSD Ped2. The data set has 70 videos in UCSD Ped1 and 28 videos in UCSD Ped2, with a total of 18560 frames. Since a real-time stream has strict memory constraints and storage issues, a simple computing machine does not suffice for performing analytics on stream data. Hence, the proposed research is designed to work on a framework supported by GPUs (Graphics Processing Units) and TPUs (Tensor Processing Units). As shown in the experimental results section, recorded observations of frame-level EER (Equal Error Rate) and AUC (Area Under Curve) showed a 9% reduction in EER on UCSD Ped1, a 13% reduction in EER on UCSD Ped2 and a 4% improvement in accuracy on both datasets.
Funding: Research presented in this paper was funded by the National Key Research and Development Program of China [grant numbers 2016YFB0501503 and 2016YFB0501502] and the Hainan Provincial Department of Science and Technology [grant number ZDKJ2016021].
Abstract: An augmented virtual environment (AVE) is concerned with the fusion of real-time video with 3D models or scenes so as to augment the virtual environment. In this paper, a new approach to establishing an AVE with a wide field of view is proposed, including real-time video projection, multiple video texture fusion and 3D visualization of moving objects. A new diagonally weighted algorithm is proposed to smooth the apparent gaps within the overlapping area between two adjacent videos. A visualization method for the location and trajectory of a moving virtual object is proposed to display the moving object and its trajectory in the 3D virtual environment. The experimental results showed that the proposed set of algorithms is able to fuse multiple real-time videos with 3D models efficiently: the experiment runs a 3D scene containing two million triangles and six real-time videos at around 55 frames per second on a laptop with 1 GB of graphics card memory. In addition, a realistic AVE with a wide field of view was created based on the Digital Earth Science Platform by fusing three videos with a complex indoor virtual scene, visualizing a moving object and drawing its trajectory in real time.
Funding: Supported by the National Basic Research Program of China (2007CB310705); the National Natural Science Foundation of China (60772024, 60711140087); the Hi-Tech Research and Development Program of China (2007AA01Z255); the NCET (06-0090); the PCSIRT (IRT0609); the ISTCP (2006DFA11040); the 111 Project of China (B07005)
Abstract: Real-time variable bit rate (VBR) video is expected to account for a significant portion of multimedia applications. However, its highly bursty traffic raises many challenges for VBR video service provision. To support multi-user real-time VBR video transmission with high bandwidth utilization and satisfactory quality of service (QoS), this article proposes a practical dynamic bandwidth management scheme. The scheme forecasts the future media rate of VBR video by employing a time-domain adaptive linear predictor, and uses the media delivery index (MDI) as both a QoS measurement and a complementary management reference. In addition, to support multi-user applications, a strategy that classifies adjustment priorities is also put forward. Finally, a test-bed based on this management scheme is established. The experimental results demonstrate that the proposed scheme is efficient, increasing bandwidth utilization by 20%-60% compared to a fixed service rate while guaranteeing QoS.
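A time-domain adaptive linear predictor of the kind mentioned in this abstract can be sketched with the normalized least-mean-squares (NLMS) update. The abstract does not say which adaptation rule the authors use, so NLMS is an assumption here, as are the function name, the filter `order` and the step size `mu`.

```python
# One-step-ahead rate forecasting with an NLMS adaptive linear predictor.
# NLMS is an assumed choice; the abstract does not name the update rule.

def nlms_forecast(rates, order=4, mu=0.5, eps=1e-9):
    """Predict each sample from the previous `order` samples, adapting the weights online.

    Returns (predictions, weights); predictions[i] is the forecast of rates[order + i].
    """
    weights = [1.0 / order] * order  # start as a plain moving average
    predictions = []
    for t in range(order, len(rates)):
        window = rates[t - order:t]
        pred = sum(w * x for w, x in zip(weights, window))
        predictions.append(pred)
        error = rates[t] - pred
        energy = sum(x * x for x in window) + eps  # normalization term
        weights = [w + mu * error * x / energy for w, x in zip(weights, window)]
    return predictions, weights


if __name__ == "__main__":
    # Synthetic alternating rate pattern (made-up values, e.g. in Mbit/s).
    series = [2.0, 4.0] * 100
    preds, _ = nlms_forecast(series, order=2)
    print("first error:", abs(series[2] - preds[0]))
    print("last error:", abs(series[-1] - preds[-1]))
```

In a bandwidth manager, each forecast would be used to set the service rate for the next interval, typically with some headroom added on top; that headroom policy is outside this sketch.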