Journal literature
97,432 articles found
Trends in Event Understanding and Caption Generation/Reconstruction in Dense Video: A Review
1
Authors: Ekanayake Mudiyanselage Chulabhaya Lankanatha Ekanayake, Abubakar Sulaiman Gezawa, Yunqi Lei. Computers, Materials & Continua (SCIE, EI), 2024, Issue 3, pp. 2941-2965, 25 pages
Video description generates natural language sentences that describe the subject, verb, and objects of the targeted video. Video description has been used to help visually impaired people understand content, and it also plays an essential role in developing human-robot interaction. Dense video description is more difficult than simple video captioning because of object interactions and overlapping events. Deep learning is changing the shape of computer vision (CV) and natural language processing (NLP) technologies. Hundreds of deep learning models, datasets, and evaluations exist that can help close gaps in current research. This article addresses that need by evaluating some state-of-the-art approaches, focusing especially on deep learning and machine learning for video captioning in dense environments. It reviews some classic machine learning techniques, surveys deep learning models, and details benchmark datasets with their respective domains. It also reviews various evaluation metrics, including Bilingual Evaluation Understudy (BLEU), Metric for Evaluation of Translation with Explicit Ordering (METEOR), Word Mover's Distance (WMD), and Recall-Oriented Understudy for Gisting Evaluation (ROUGE), with their pros and cons. Finally, the article lists some future directions and proposes work on context enhancement using key scene extraction with object detection in a particular frame, in particular how to improve the context of video description by detecting key frames through morphological image analysis. Additionally, the paper discusses a novel approach involving sentence reconstruction and context improvement through key frame object detection, which incorporates the fusion of large language models for refining results. The final results arise from enhancing the generated text of the proposed model by improving the predicted text and isolating objects using various keyframes. These keyframes identify dense events occurring in the video sequence.
Keywords: video description; video to text; video caption; sentence reconstruction
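The abstract above names BLEU among the caption evaluation metrics. As a rough illustration only, the sketch below computes the clipped n-gram precision that BLEU is built on; full BLEU additionally combines several n-gram orders and applies a brevity penalty, which are omitted here. The function names and the two example captions are hypothetical and do not come from the reviewed paper.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(candidate, reference, n=1):
    """Clipped n-gram precision: each candidate n-gram is credited at most
    as many times as it occurs in the reference."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref_counts[gram])
                  for gram, count in cand_counts.items())
    return matched / total


# Hypothetical generated caption and reference caption.
candidate = "a man is playing a guitar".split()
reference = "a man plays a guitar on stage".split()

p1 = clipped_precision(candidate, reference, n=1)  # unigram precision
p2 = clipped_precision(candidate, reference, n=2)  # bigram precision
print(round(p1, 3), round(p2, 3))
```

Clipping is what stops a caption from scoring well by repeating one common word many times.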
Cloud-based video streaming services: Trends, challenges, and opportunities
2
Authors: Tajinder Kumar, Purushottam Sharma, Jaswinder Tanwar, Hisham Alsghier, Shashi Bhushan, Hesham Alhumyani, Vivek Sharma, Ahmed I. Alutaibi. CAAI Transactions on Intelligence Technology (SCIE, EI), 2024, Issue 2, pp. 265-285, 21 pages
Cloud computing has drastically changed the delivery and consumption of live streaming content. The designs, challenges, and possible uses of cloud computing for live streaming are studied. A comprehensive overview of the technical and business issues surrounding cloud-based live streaming is provided, including the benefits of cloud computing, the various live streaming architectures, and the challenges that live streaming service providers face in delivering high-quality, real-time services. The different techniques used to improve the performance of video streaming, such as adaptive bit-rate streaming, multicast distribution, and edge computing, are discussed, and the necessity of low-latency, high-quality video transmission in cloud-based live streaming is underlined. Ways of improving user experience and live streaming service performance with cutting-edge technology, such as artificial intelligence and machine learning, are discussed. In addition, the legal and regulatory implications of cloud-based live streaming, including issues with network neutrality, data privacy, and content moderation, are addressed. The future of cloud computing for live streaming is then examined, with a look at the most likely new developments in trends and technology. For technology vendors, live streaming service providers, and regulators, the findings have major policy-relevant implications. Suggestions on how stakeholders should address these concerns and take advantage of the potential presented by this rapidly evolving sector are provided, along with insights into the key challenges and opportunities associated with cloud-based live streaming.
Keywords: cloud computing; video analysis; video coding
A HEVC Video Steganalysis Method Using the Optimality of Motion Vector Prediction
3
Authors: Jun Li, Minqing Zhang, Ke Niu, Yingnan Zhang, Xiaoyuan Yang. Computers, Materials & Continua (SCIE, EI), 2024, Issue 5, pp. 2085-2103, 19 pages
Among steganalysis techniques, detection of MV (motion vector) domain-based video steganography in the HEVC (High Efficiency Video Coding) standard remains a challenging issue. To improve detection performance, this paper proposes a steganalysis method that can perfectly detect MV-based steganography in HEVC. First, we define the local optimality of MVP (Motion Vector Prediction) based on AMVP (Advanced Motion Vector Prediction) technology. Second, we show that in HEVC video, message embedding using either the MVP index or the MVD (Motion Vector Difference) may destroy this optimality of MVP. We then define the optimal rate of MVP as a steganalysis feature. Finally, we conduct steganalysis detection experiments on two general datasets for three popular steganography methods and compare the performance with four state-of-the-art steganalysis methods. The experimental results demonstrate the effectiveness of the proposed feature set. Furthermore, our method stands out for its practical applicability: it requires no model training and has low computational complexity, making it a viable solution for real-world scenarios.
Keywords: video steganography; video steganalysis; motion vector prediction; motion vector difference; advanced motion vector prediction; local optimality
Customized Convolutional Neural Network for Accurate Detection of Deep Fake Images in Video Collections
4
Authors: Dmitry Gura, Bo Dong, Duaa Mehiar, Nidal Al Said. Computers, Materials & Continua (SCIE, EI), 2024, Issue 5, pp. 1995-2014, 20 pages
The motivation for this study is that the quality of deep fakes is constantly improving, which creates a need for new detection methods. The proposed Customized Convolutional Neural Network method extracts structured data from video frames using facial landmark detection, which is then used as input to the CNN. The customized Convolutional Neural Network method is a data augmentation-based CNN model that generates ‘fake data’ or ‘fake images’. This study was carried out using Python and its libraries. We used 242 videos from the dataset gathered by the Deep Fake Detection Challenge, of which 199 were fake and the remaining 53 were real. Ten seconds were allotted for each video. There were 318 videos used in all, 199 of which were fake and 119 of which were real. Our proposed method achieved a testing accuracy of 91.47%, a loss of 0.342, and an AUC score of 0.92, outperforming two alternative approaches, CNN and MLP-CNN. Furthermore, our method achieved greater accuracy than contemporary models such as XceptionNet, Meso-4, EfficientNet-B0, MesoInception-4, VGG-16, and DST-Net. The novelty of this investigation is the development of a new Convolutional Neural Network (CNN) learning model that can accurately detect deep fake face photos.
Keywords: Deep fake detection; video analysis; convolutional neural network; machine learning; video dataset collection; facial landmark prediction; accuracy models
Research on Demand Response Potential of Adjustable Loads in Demand Response Scenarios
5
Authors: Zhishuo Zhang, Xinhui Du, Yaoke Shang, Jingshu Zhang, Wei Zhao, Jia Su. Energy Engineering (EI), 2024, Issue 6, pp. 1577-1605, 29 pages
To address the issues of limited demand response data, low generalization of demand response potential evaluation, and poor demand response effect, the article proposes a demand response potential feature extraction and prediction model based on data mining, and a demand response potential assessment model for adjustable loads in demand response scenarios based on subjective and objective weight analysis. First, demand response characteristics that describe the demand response process and behavior are obtained. Second, a feature extraction and prediction model based on data mining is established, including similar-day clustering, time series decomposition, redundancy processing, and data prediction, and the predicted values of each demand response feature on the response day are obtained. Third, the predicted data of the various characteristics on the response day are used as demand response potential evaluation indicators to represent different demand response scenarios and adjustable loads, and a demand response potential evaluation model based on subjective and objective weight allocation is established to calculate the demand response potential of different adjustable loads in different demand response scenarios. Finally, the effectiveness of the proposed method is verified through examples, providing a reference for load aggregators formulating demand response schemes.
Keywords: demand response potential; demand response scenarios; data mining; adjustable load; evaluation system; subjective and objective weight allocation
A Combination Prediction Model for Short Term Travel Demand of Urban Taxi
6
Authors: Mingyuan Li, Yuanli Gu, Qingqiao Geng, Hongru Yu. Computers, Materials & Continua (SCIE, EI), 2024, Issue 6, pp. 3877-3896, 20 pages
This study proposes a prediction model that considers external weather and holiday factors to address the difficulty of accurately predicting urban taxi travel demand, which is caused by complex data and numerous influencing factors. The model integrates Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) and a Convolutional Long Short Term Memory Neural Network (ConvLSTM) to predict short-term taxi travel demand. The CEEMDAN decomposition method effectively decomposes time series data into a set of modal components, capturing sequence characteristics at different time scales and frequencies. Based on the sample entropy value of the components, the more complex sequence components undergo secondary processing after decomposition to reduce the cumulative prediction error of the component sequences and improve prediction efficiency. On this basis, considering the correlation between the spatiotemporal trends of short-term taxi traffic, a ConvLSTM neural network model with the time series processing ability of Long Short Term Memory (LSTM) and the spatial feature processing ability of Convolutional Neural Networks (CNN) is constructed to predict the travel demand for urban taxis. The combined prediction model is tested on a taxi travel demand dataset from an area of Beijing. The results show that the CEEMDAN-ConvLSTM prediction model outperforms the LSTM, Autoregressive Integrated Moving Average (ARIMA), CNN, and ConvLSTM benchmark models in terms of Symmetric Mean Absolute Percentage Error (SMAPE), Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and R² metrics. Notably, the SMAPE metric declines by 21.03% with the proposed model. These results confirm that the study provides a highly accurate and valid model for taxi travel demand forecasting.
Keywords: Urban transport; taxi travel demand prediction; CEEMDAN-ConvLSTM; modal components
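The four error metrics named in the abstract above (SMAPE, RMSE, MAE, R²) can be sketched in a few lines. The definitions below are common textbook forms; the abstract does not state which SMAPE variant the paper uses, so the formula here (mean of |a - p| / ((|a| + |p|) / 2), in percent) is an assumption, and the demand values are made up for illustration.

```python
import math


def smape(actual, predicted):
    """Symmetric mean absolute percentage error, in percent (one common variant)."""
    terms = []
    for a, p in zip(actual, predicted):
        denom = (abs(a) + abs(p)) / 2
        terms.append(0.0 if denom == 0 else abs(a - p) / denom)
    return 100 * sum(terms) / len(terms)


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r2(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical hourly taxi demand (trips) and model forecasts.
actual = [100, 120, 80, 90]
predicted = [110, 115, 70, 95]

scores = {
    "SMAPE": smape(actual, predicted),
    "RMSE": rmse(actual, predicted),
    "MAE": mae(actual, predicted),
    "R2": r2(actual, predicted),
}
print(scores)
```

Lower is better for SMAPE, RMSE, and MAE, while R² closer to 1 indicates a better fit.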
Stochastic programming based coordinated expansion planning of generation,transmission,demand side resources,and energy storage considering the DC transmission system
7
Authors: Liang Lu, Mingkui Wei, Yuxuan Tao, Qing Wang, Yuxiao Yang, Chuan He, Haonan Zhang. Global Energy Interconnection (EI, CSCD), 2024, Issue 1, pp. 25-37, 13 pages
With the increasing penetration of wind and solar energy, the accompanying uncertainty that propagates through the system places higher requirements on the expansion planning of power systems. A source-grid-load-storage coordinated expansion planning model based on stochastic programming is proposed to suppress the impact of wind and solar energy fluctuations. Multiple types of system components, including demand response service entities, converter stations, DC transmission systems, cascade hydropower stations, and other traditional components, are modeled extensively. Moreover, energy storage systems are considered in order to improve the accommodation level of renewable energy and alleviate the influence of intermittency. Demand response service entities on the load side are used to reduce and shift demand during peak load periods. The uncertainties in wind, solar energy, and loads are simulated using stochastic programming. Finally, the effectiveness of the proposed model is verified through numerical simulations.
Keywords: Hydro-wind-solar complementary; Expansion planning; demand response; Energy storage system; Source-network-demand-storage coordination
Optimal dispatching strategy for residential demand response considering load participation
8
Authors: Xiaoyu Zhou, Xiaofeng Liu, Huai Liu, Zhenya Ji, Feng Li. Global Energy Interconnection (EI, CSCD), 2024, Issue 1, pp. 38-47, 10 pages
To facilitate the coordinated and large-scale participation of residential flexible loads in demand response (DR), a load aggregator (LA) can integrate these loads for scheduling. In this study, a residential DR optimization scheduling strategy was formulated considering the participation of flexible loads in DR. First, based on the operational characteristics of flexible loads such as electric vehicles, air conditioners, and dishwashers, their DR participation, which is the basis for calculating the compensation price paid to users, was determined by treating these loads as virtual energy storage. Participation was quantified based on the state of the virtual energy storage during each time slot. Second, flexible loads were clustered using the K-means algorithm, taking the typical operational and behavioral characteristics as the cluster centroids. Finally, the LA scheduling strategy was implemented by introducing a DR mechanism based on the directrix load. The simulation results demonstrate that the proposed DR approach can effectively reduce peak loads and fill valleys, thereby improving load management performance.
Keywords: Residential demand response; Flexible loads; Load participation; Load aggregator
Problematic Use of Video Games in Schools in Northern Benin (2023)
9
Authors: Ireti Nethania Elie Ataigba, David Sinet Koivogui, Damega Wenkourama, Marcos Tohou, Eurydice Elvire Djossou, Anselme Djidonou, Francis Tognon Tchegnonsi, Prosper Gandaho, Josiane Ezin Houngbe. Open Journal of Psychiatry, 2024, Issue 2, pp. 120-141, 22 pages
Objective: To study the problematic use of video games among secondary school students in the city of Parakou in 2023. Methods: Descriptive cross-sectional study conducted in the commune of Parakou from December 2022 to July 2023. The study population consisted of students regularly enrolled in public and private secondary schools in the city of Parakou for the 2022-2023 academic year. A two-stage non-proportional stratified sampling technique combined with simple random sampling was adopted. The Problem Video Game Playing (PVP) scale was used to assess problematic video game use in the study population, while anxiety and depression were assessed using the Hospital Anxiety and Depression Scale (HADS). Results: A total of 1030 students were included. The mean age of the pupils surveyed was 15.06 ± 2.68 years, with extremes of 10 and 28 years. The [13 - 18] age group was the most represented, with a proportion of 59.6% (614) in the general population. Females predominated, at 52.8% (544), with a sex ratio of 0.89. The prevalence of problematic video game use, measured using the Problem Video Game Playing scale, was 24.9%. Associated factors were male gender (p = 0.005), pocket money under 10,000 CFA (p = 0.001) and between 20,000 and 90,000 CFA (p = 0.030), addictive family behavior (p < 0.001), monogamous family (p = 0.023), good relationship with father (p = 0.020), organization of video game competitions (p = 0.001), and definite anxiety (p ...). Conclusion: Substance-free addiction is struggling to attract the attention it deserves, as it did in its infancy everywhere else. This study complements existing data and serves as a reminder of the need to focus on this group of addictions, among which problematic video game use remains the most frequent because of its accessibility and social tolerance. Preventive action combined with curative measures remains the most effective means of combating the problem at the national level.
Keywords: Gaming; Problem; video Games; BENIN; 2023
A Video Captioning Method by Semantic Topic-Guided Generation
10
Authors: Ou Ye, Xinli Wei, Zhenhua Yu, Yan Fu, Ying Yang. Computers, Materials & Continua (SCIE, EI), 2024, Issue 1, pp. 1071-1093, 23 pages
In video captioning methods based on an encoder-decoder, limited visual features are extracted by an encoder, and a natural sentence describing the video content is generated by a decoder. However, this kind of method depends on a single video input source and few visual labels, and it suffers from poor semantic alignment between video contents and generated sentences, which makes it unsuitable for accurately comprehending and describing video contents. To address this issue, this paper proposes a video captioning method based on semantic topic-guided generation. First, a 3D convolutional neural network is used to extract the spatiotemporal features of videos during encoding. Then, the semantic topics of the video data are extracted using visual labels retrieved from similar video data. In decoding, a decoder is constructed by combining a novel Enhance-TopK sampling algorithm with a Generative Pre-trained Transformer-2 deep neural network, which decreases the influence of “deviation” in the semantic mapping between videos and texts by jointly decoding a baseline and the semantic topics of the video contents. During this process, the Enhance-TopK sampling algorithm can alleviate the long-tail problem by dynamically adjusting the probability distribution of the predicted words. Finally, experiments are conducted on two public datasets, Microsoft Research Video Description and Microsoft Research-Video to Text. The experimental results demonstrate that the proposed method outperforms several state-of-the-art approaches. Specifically, the Bilingual Evaluation Understudy, Metric for Evaluation of Translation with Explicit Ordering, Recall Oriented Understudy for Gisting Evaluation-longest common subsequence, and Consensus-based Image Description Evaluation scores of the proposed method improve by 1.2%, 0.1%, 0.3%, and 2.4% on the Microsoft Research Video Description dataset, and by 0.1%, 1.0%, 0.1%, and 2.8% on the Microsoft Research-Video to Text dataset, respectively, compared with existing video captioning methods. As a result, the proposed method can generate video captions that are more closely aligned with natural human language habits.
Keywords: video captioning; encoder-decoder; semantic topic; jointly decoding; Enhance-TopK sampling
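The abstract above builds on top-k sampling but does not describe how Enhance-TopK adjusts the distribution. For orientation only, the sketch below shows plain top-k sampling, the standard technique that such variants extend: keep the k most probable next words, renormalize, and sample. It is not the paper's algorithm, and the vocabulary and probabilities are invented for the example.

```python
import random


def top_k_filter(probs, k):
    """Keep the k most probable tokens and renormalize their probabilities."""
    top = sorted(probs.items(), key=lambda item: item[1], reverse=True)[:k]
    total = sum(p for _, p in top)
    return {token: p / total for token, p in top}


def sample_top_k(probs, k, rng):
    """Draw one token from the renormalized top-k distribution."""
    filtered = top_k_filter(probs, k)
    tokens = list(filtered)
    weights = [filtered[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Hypothetical next-word distribution produced by a caption decoder.
next_word_probs = {"dog": 0.5, "cat": 0.3, "car": 0.1, "tree": 0.06, "sky": 0.04}

filtered = top_k_filter(next_word_probs, k=2)
word = sample_top_k(next_word_probs, k=2, rng=random.Random(0))
print(filtered, word)
```

Because low-probability words are cut off entirely, plain top-k never emits rare words outside the top k, which is one reason long-tail-aware variants adjust the distribution instead.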
Video Summarization Approach Based on Binary Robust Invariant Scalable Keypoints and Bisecting K-Means
11
Authors: Sameh Zarif, Eman Morad, Khalid Amin, Abdullah Alharbi, Wail S. Elkilani, Shouze Tang. Computers, Materials & Continua (SCIE, EI), 2024, Issue 3, pp. 3565-3583, 19 pages
Due to the exponential growth of video data, aided by rapid advancements in multimedia technologies, it has become difficult for users to obtain information from a large video series. The process of providing an abstract of the entire video that includes the most representative frames is known as static video summarization. It enables rapid exploration, indexing, and retrieval of massive video libraries. We propose a framework for static video summarization based on Binary Robust Invariant Scalable Keypoints (BRISK) and the bisecting K-means clustering algorithm. The method recognizes relevant frames by using BRISK to extract keypoints and descriptors from video sequences. The BRISK features of the video frames are clustered using bisecting K-means, and the keyframe of each cluster is the frame nearest to the cluster center. The appropriate number of clusters is determined using the silhouette coefficient, without setting any clustering parameters by hand. Experiments were carried out on the publicly available Open Video Project (OVP) dataset, which contains videos of different genres. The effectiveness of the proposed method is compared with existing methods using a variety of evaluation metrics, and the proposed method achieves a trade-off between computational cost and quality.
Keywords: BRISK; bisecting K-mean; video summarization; keyframe extraction; shot detection
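The clustering step in the abstract above (bisecting K-means, then picking the frame nearest each cluster center) can be sketched as follows. This is a minimal illustration under several assumptions: each frame is represented by a toy 2-D feature vector rather than real BRISK descriptors, the number of clusters is fixed by hand instead of being chosen with the silhouette coefficient, and all function names are hypothetical.

```python
def mean(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def dist2(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sse(feats, idxs):
    """Sum of squared distances of a cluster's members to its centroid."""
    center = mean([feats[i] for i in idxs])
    return sum(dist2(feats[i], center) for i in idxs)


def assign(feats, idxs, ca, cb):
    """Split indices by which of the two centers is closer."""
    left = [i for i in idxs if dist2(feats[i], ca) <= dist2(feats[i], cb)]
    right = [i for i in idxs if dist2(feats[i], ca) > dist2(feats[i], cb)]
    return left, right


def two_means(feats, idxs, iters=20):
    """Deterministic 2-means: seed with two far-apart points, then refine."""
    center = mean([feats[i] for i in idxs])
    a = max(idxs, key=lambda i: dist2(feats[i], center))
    b = max(idxs, key=lambda i: dist2(feats[i], feats[a]))
    left, right = assign(feats, idxs, feats[a], feats[b])
    for _ in range(iters):
        ca = mean([feats[i] for i in left])
        cb = mean([feats[i] for i in right])
        new_left, new_right = assign(feats, idxs, ca, cb)
        if not new_left or not new_right or new_left == left:
            break  # converged, or a split would empty one side
        left, right = new_left, new_right
    return left, right


def bisecting_kmeans(feats, k):
    """Repeatedly bisect the cluster with the largest SSE until k clusters exist."""
    clusters = [list(range(len(feats)))]
    while len(clusters) < k:
        worst = max(clusters, key=lambda c: sse(feats, c))
        if sse(feats, worst) == 0:
            break  # nothing left to split
        clusters.remove(worst)
        clusters.extend(two_means(feats, worst))
    return clusters


def keyframes(feats, clusters):
    """For each cluster, pick the frame index nearest the cluster center."""
    picks = []
    for c in clusters:
        center = mean([feats[i] for i in c])
        picks.append(min(c, key=lambda i: dist2(feats[i], center)))
    return sorted(picks)


# Toy per-frame features forming three visually distinct "shots".
features = [(0.0, 0.1), (0.1, 0.0), (0.0, 0.0),
            (5.0, 5.1), (5.1, 5.0), (5.0, 5.0),
            (9.0, 0.0), (9.1, 0.1), (9.0, 0.2)]

clusters = bisecting_kmeans(features, k=3)
summary = keyframes(features, clusters)
print(clusters, summary)
```

Bisecting the worst cluster each round avoids the random restarts that ordinary K-means usually needs, which fits the abstract's emphasis on limiting computational cost.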
Optimal Bidding Strategies of Microgrid with Demand Side Management for Economic Emission Dispatch Incorporating Uncertainty and Outage of Renewable Energy Sources
12
Authors: Mousumi Basu, Chitralekha Jena, Baseem Khan, Ahmed Ali. Energy Engineering (EI), 2024, Issue 4, pp. 849-867, 19 pages
In the restructured electricity market, a microgrid (MG) that incorporates smart grid technologies, distributed energy resources (DERs), a pumped-storage-hydraulic (PSH) unit, and a demand response program (DRP) is a smarter and more reliable electricity provider. The DERs consist of gas turbines and renewable energy sources such as photovoltaic systems and wind turbines. Better bidding strategies, prepared by MG operators, decrease the electricity cost and the emissions from the upstream grid and from conventional and renewable energy sources (RES). However, this is inefficient because of the highly sporadic nature of RES and the very high outage rate. To solve these issues, this study suggests the non-dominated sorting genetic algorithm II (NSGA-II) for an optimal bidding strategy that considers pumped hydroelectric energy storage and a DRP, based on outage conditions and the uncertainties of renewable energy sources. The uncertainty related to solar and wind units is modeled using lognormal and Weibull probability distributions. A time-of-use (TOU) based DRP is used, with particular attention to the timing of outages, peak loads, and peak prices, to enhance the reliability of the MG and reduce costs and emissions.
Keywords: micro-grid; distributed energy resources; demand response program; uncertainty; outage
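The abstract above states that solar and wind uncertainty is modeled with lognormal and Weibull distributions. A minimal scenario-sampling sketch using Python's standard `random` module follows. The pairing (Weibull for wind speed, lognormal for solar irradiance) is a common convention assumed here rather than stated in the abstract, and every parameter value is hypothetical.

```python
import random


def sample_scenarios(n, rng, wind_scale=8.0, wind_shape=2.0,
                     solar_mu=-0.5, solar_sigma=0.4):
    """Draw n (wind speed, solar irradiance) scenarios.

    Wind speed (m/s) is Weibull with the given scale and shape; irradiance
    (kW/m^2) is lognormal with the given log-mean and log-standard deviation.
    All default parameters are illustrative assumptions.
    """
    scenarios = []
    for _ in range(n):
        wind_speed = rng.weibullvariate(wind_scale, wind_shape)
        irradiance = rng.lognormvariate(solar_mu, solar_sigma)
        scenarios.append((wind_speed, irradiance))
    return scenarios


scenarios = sample_scenarios(1000, random.Random(42))
mean_wind = sum(w for w, _ in scenarios) / len(scenarios)
mean_irradiance = sum(s for _, s in scenarios) / len(scenarios)
print(round(mean_wind, 2), round(mean_irradiance, 2))
```

In a bidding study, each sampled scenario would then be converted to unit output through a turbine power curve and a PV model before being passed to the optimizer; those steps are outside this sketch.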
Workout Action Recognition in Video Streams Using an Attention Driven Residual DC-GRU Network
13
Authors: Arnab Dey, Samit Biswas, Dac-Nhuong Le. Computers, Materials & Continua (SCIE, EI), 2024, Issue 5, pp. 3067-3087, 21 pages
Regular exercise is a crucial aspect of daily life, as it enables individuals to stay physically active, lowers the likelihood of developing illnesses, and enhances life expectancy. The recognition of workout actions in video streams holds significant importance in computer vision research, as it aims to enhance exercise adherence, enable instant recognition, advance fitness tracking technologies, and optimize fitness routines. However, existing action datasets often lack diversity and specificity for workout actions, hindering the development of accurate recognition models. To address this gap, the Workout Action Video dataset (WAVd) has been introduced as a significant contribution. WAVd comprises a diverse collection of labeled workout action videos, meticulously curated to encompass various exercises performed by numerous individuals in different settings. This research proposes an innovative framework based on the Attention-driven Residual Deep Convolutional-Gated Recurrent Unit (ResDC-GRU) network for workout action recognition in video streams. Unlike image-based action recognition, videos contain spatio-temporal information, making the task more complex and challenging. While substantial progress has been made in this area, challenges persist in detecting subtle and complex actions, handling occlusions, and managing the computational demands of deep learning approaches. The proposed ResDC-GRU Attention model achieved 95.81% accuracy in classifying workout action videos and outperformed various state-of-the-art models. The method also yielded 81.6%, 97.2%, 95.6%, and 93.2% accuracy on the established benchmark datasets HMDB51, Youtube Actions, UCF50, and UCF101, respectively, showcasing its robustness in action recognition. The findings suggest practical implications in real-world scenarios where precise video action recognition is paramount, addressing the persisting challenges in the field. The WAVd dataset serves as a catalyst for the development of more robust and effective fitness tracking systems and ultimately promotes healthier lifestyles through improved exercise monitoring and analysis.
Keywords: Workout action recognition; video stream; action recognition; residual network; GRU; attention
A Novel Defender-Attacker-Defender Model for Resilient Distributed Generator Planning with Network Reconfiguration and Demand Response
14
Authors: Wenlu Ji, Teng Tu, Nan Ma. Energy Engineering (EI), 2024, Issue 5, pp. 1223-1243, 21 pages
To improve the resilience of a distribution system against extreme weather, a fuel-based distributed generator (DG) allocation model is proposed in this study. In this model, the DGs are placed at the planning stage. When an extreme event occurs, the controllable generators form temporary microgrids (MGs) to restore as much load as possible. Simultaneously, a demand response program (DRP) mitigates the imbalance between power supply and demand during extreme events. To cope with fault uncertainty, a robust optimization (RO) method is applied to reduce the long-term investment and short-term operation costs. The optimization is formulated as a tri-level defender-attacker-defender (DAD) framework. At the first level, decision-makers work out the DG allocation scheme; at the second level, the attacker finds the optimal attack strategy with maximum damage; and at the third level, restoration measures, namely distribution network reconfiguration (DNR) and demand response, are performed. The problem is solved by the nested column and constraint generation (NC&CG) method, and the model is validated using an IEEE 33-node system. Case studies validate the effectiveness and superiority of the proposed model in terms of enhanced resilience and reduced cost.
Keywords: Distribution system; resilience; defender-attacker-defender; distributed generator; demand response; microgrids formation
Video caching and scheduling with edge cooperation
15
Authors: Zhidu Li, Fuxiang Li, Tong Tang, Hong Zhang, Jin Yang. Digital Communications and Networks (SCIE, CSCD), 2024, Issue 2, pp. 450-460, 11 pages
In this paper, we explore a distributed collaborative caching and computing model to support the distribution of adaptive bit rate video streaming. The aim is to reduce the average initial buffer delay and improve the quality of user experience. Considering the difference between global and local video popularities and the time-varying characteristics of video popularity, a two-stage caching scheme is proposed to push popular videos closer to users and minimize the average initial buffer delay. Based on both long-term and short-term content popularity, the proposed caching solution is decoupled into a proactive cache stage and a cache update stage. In the proactive cache stage, we develop a proactive cache placement algorithm that can be executed in an off-peak period. In the cache update stage, we propose a reactive cache update algorithm that updates the existing cache policy to minimize the buffer delay. Simulation results verify that the proposed caching algorithms can reduce the initial buffer delay efficiently.
Keywords: video service; Distributed and collaborative caching; Long-term popularity; Short-term popularity
Multi-Stream Temporally Enhanced Network for Video Salient Object Detection
16
Authors: Dan Xu, Jiale Ru, Jinlong Shi. Computers, Materials & Continua (SCIE, EI), 2024, Issue 1, pp. 85-104, 20 pages
Video salient object detection (VSOD) aims at locating the most attractive objects in a video by exploring spatial and temporal features. VSOD is a challenging task in computer vision, as it involves processing complex spatial data that is also influenced by temporal dynamics. Despite the progress made by existing VSOD models, they still struggle in scenes with great background diversity within and between frames. Additionally, they encounter difficulties related to accumulated noise and high time consumption when extracting temporal features over a long duration. We propose a multi-stream temporal enhanced network (MSTENet) to address these problems. It investigates the collaboration of saliency cues in the spatial domain with a multi-stream structure to deal with the challenge of great background diversity. A straightforward yet efficient approach to temporal feature extraction is developed to avoid accumulated noise and reduce time consumption. MSTENet differs from other VSOD methods in that it incorporates both foreground supervision and background supervision, which facilitates the extraction of collaborative saliency cues. Another notable difference is its integration of spatial and temporal features: the temporal module is integrated into the multi-stream structure, enabling comprehensive spatial-temporal interactions within an end-to-end framework. Extensive experimental results demonstrate that the proposed method achieves state-of-the-art performance on five benchmark datasets while maintaining a real-time speed of 27 fps (Titan XP). Our code and models are available at https://github.com/RuJiaLe/MSTENet.
Keywords: video salient object detection; deep learning; temporally enhanced; foreground-background collaboration
A Hybrid Machine Learning Approach for Improvised QoE in Video Services over 5G Wireless Networks
17
Authors: K. B. Ajeyprasaath, P. Vetrivelan. Computers, Materials & Continua, SCIE EI, 2024, No. 3, pp. 3195-3213 (19 pages)
Video streaming applications have grown considerably in recent years. As a result, video streaming has become one of the most significant contributors to global internet traffic. According to recent studies, the telecommunications industry loses millions of dollars due to poor video Quality of Experience (QoE) for users. Among the proposals for standardizing the quality of video streaming over internet service providers (ISPs) is the Mean Opinion Score (MOS). However, determining QoE accurately by MOS is subjective and laborious, and it varies depending on the user. A fully automated data analytics framework is required to reduce the inter-operator variability characteristic of QoE assessment. This work addresses this concern by suggesting a novel hybrid XGBStackQoE analytical model using a two-level layering technique. Level one combines multiple Machine Learning (ML) models via a layer one Hybrid XGBStackQoE-model. Individual ML models at level one are trained using the entire training data set. The level two Hybrid XGBStackQoE-model is fitted using the outputs (meta-features) of the layer one ML models. The proposed model outperformed the conventional models, with an accuracy improvement of 4 to 5 percent. The proposed framework could significantly improve video QoE accuracy.
Keywords: Hybrid XGBStackQoE-model; machine learning; MOS; performance metrics; QoE; 5G video services
Generative Multi-Modal Mutual Enhancement Video Semantic Communications
18
Authors: Yuanle Chen, Haobo Wang, Chunyu Liu, Linyi Wang, Jiaxin Liu, Wei Wu. Computer Modeling in Engineering & Sciences, SCIE EI, 2024, No. 6, pp. 2985-3009 (25 pages)
Recently, there have been significant advancements in the study of semantic communication in single-modal scenarios. However, the ability to process information in multi-modal environments remains limited. Inspired by the research and applications of natural language processing across different modalities, our goal is to accurately extract frame-level semantic information from videos and ultimately transmit high-quality videos. Specifically, we propose a deep learning-based Multi-Modal Mutual Enhancement Video Semantic Communication system, called M3E-VSC. Built upon a Vector Quantized Generative Adversarial Network (VQGAN), our system aims to leverage mutual enhancement among different modalities by using text as the main carrier of transmission. With it, the semantic information can be extracted from key-frame images and audio of the video and perform differential value to ensure that the extracted text conveys accurate semantic information with fewer bits, thus improving the capacity of the system. Furthermore, a multi-frame semantic detection module is designed to facilitate semantic transitions during video generation. Simulation results demonstrate that our proposed model maintains high robustness in complex noise environments, particularly in low signal-to-noise ratio conditions, significantly improving the accuracy and speed of semantic transmission in video communication by approximately 50 percent.
Keywords: generative adversarial networks; multi-modal mutual enhancement; video semantic transmission; deep learning
Improving Video Watermarking through Galois Field GF(2^(4)) Multiplication Tables with Diverse Irreducible Polynomials and Adaptive Techniques
19
Authors: Yasmin Alaa Hassan, Abdul Monem S. Rahma. Computers, Materials & Continua, SCIE EI, 2024, No. 1, pp. 1423-1442 (20 pages)
Video watermarking plays a crucial role in protecting intellectual property rights and ensuring content authenticity. This study delves into the integration of Galois Field (GF) multiplication tables, especially GF(2^(4)), and their interaction with distinct irreducible polynomials. The primary aim is to enhance watermarking techniques for achieving imperceptibility, robustness, and efficient execution time. The research employs scene selection and adaptive thresholding techniques to streamline the watermarking process. Scene selection is used strategically to embed watermarks in the most vital frames of the video, while adaptive thresholding methods ensure that the watermarking process adheres to imperceptibility criteria, maintaining the video's visual quality. Concurrently, careful consideration is given to execution time, crucial in real-world scenarios, to balance efficiency and efficacy. The Peak Signal-to-Noise Ratio (PSNR) serves as a pivotal metric to gauge the watermark's imperceptibility and video quality. The study explores various irreducible polynomials, navigating the trade-offs between computational efficiency and watermark imperceptibility. This comprehensive analysis provides valuable insights into the interplay of GF multiplication tables, diverse irreducible polynomials, scene selection, adaptive thresholding, imperceptibility, and execution time. The robustness of the proposed algorithm was evaluated using PSNR and NC metrics under five distinct attack scenarios. These findings contribute to the development of watermarking strategies that balance imperceptibility, robustness, and processing efficiency, enhancing the field's practicality and effectiveness.
Keywords: video watermarking; Galois field; irreducible polynomial; multiplication table; scene selection; adaptive thresholding
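As a rough illustration of the GF(2^(4)) multiplication tables mentioned in the abstract above, the following Python sketch builds one 16x16 table per irreducible polynomial of degree 4 over GF(2). It is a minimal sketch only: the names `IRREDUCIBLE_POLYS`, `gf16_mul`, and `gf16_table` are hypothetical, and the abstract does not say how the paper uses the tables during watermark embedding, so that step is not shown.

```python
# Hypothetical helper names; not taken from the paper.
# The three irreducible polynomials of degree 4 over GF(2), stored as bit masks
# (bit i set means the term x^i is present).
IRREDUCIBLE_POLYS = {
    "x^4 + x + 1": 0b10011,
    "x^4 + x^3 + 1": 0b11001,
    "x^4 + x^3 + x^2 + x + 1": 0b11111,
}


def gf16_mul(a, b, poly):
    """Multiply two GF(2^4) elements (ints 0..15), reducing modulo `poly`."""
    result = 0
    while b:
        if b & 1:
            result ^= a      # addition in GF(2^n) is XOR
        b >>= 1
        a <<= 1              # multiply a by x
        if a & 0b10000:      # degree reached 4: reduce by the polynomial
            a ^= poly
    return result


def gf16_table(poly):
    """Return the full 16x16 multiplication table for the given polynomial."""
    return [[gf16_mul(a, b, poly) for b in range(16)] for a in range(16)]


if __name__ == "__main__":
    for name, poly in IRREDUCIBLE_POLYS.items():
        table = gf16_table(poly)
        # x * x^3 = x^4, which reduces differently under each polynomial.
        print(f"{name}: x * x^3 = {table[2][8]}")
```

Each polynomial yields a different table over the same 16 elements, which is the kind of variation the abstract refers to when it compares diverse irreducible polynomials.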
TEAM:Transformer Encoder Attention Module for Video Classification
20
Authors: Hae Sung Park, Yong Suk Choi. Computer Systems Science & Engineering, 2024, No. 2, pp. 451-477 (27 pages)
Much like humans focus solely on object movement to understand actions, directing a deep learning model's attention to the core contexts within videos is crucial for improving video comprehension. In a recent study, Video Masked Auto-Encoder (VideoMAE) employs a pre-training approach with a high ratio of tube masking and reconstruction, effectively mitigating spatial bias due to temporal redundancy in full video frames. This steers the model's focus toward detailed temporal contexts. However, as VideoMAE still relies on full video frames during the action recognition stage, it may exhibit a progressive shift in attention towards spatial contexts, deteriorating its ability to capture the main spatio-temporal contexts. To address this issue, we propose an attention-directing module named Transformer Encoder Attention Module (TEAM). This proposed module effectively directs the model's attention to the core characteristics within each video, inherently mitigating spatial bias. The TEAM first identifies the core features among the overall extracted features from each video. After that, it discerns the specific parts of the video where those features are located, encouraging the model to focus more on these informative parts. Consequently, during the action recognition stage, the proposed TEAM effectively shifts VideoMAE's attention from spatial contexts towards the core spatio-temporal contexts. This attention shift alleviates the spatial bias in the model and simultaneously enhances its ability to capture precise video contexts. We conduct extensive experiments to explore the optimal configuration that enables the TEAM to fulfill its intended design purpose and facilitates its seamless integration with the VideoMAE framework. The integrated model, i.e., VideoMAE+TEAM, outperforms the existing VideoMAE by a significant margin on Something-Something-V2 (71.3% vs. 70.3%). Moreover, the qualitative comparisons demonstrate that the TEAM encourages the model to disregard insignificant features and focus more on the essential video features, capturing more detailed spatio-temporal contexts within the video.
Keywords: video classification; action recognition; vision transformer; masked auto-encoder