Type 2 diabetes mellitus (T2DM) is a complex metabolic disease threatening human health. We investigated the effects of Tegillarca granosa polysaccharide (TGP) and determined its potential mechanisms in a mouse model of T2DM established through a high-fat diet and streptozotocin. TGP (5.1 × 10^3 Da) was composed of mannose, glucosamine, rhamnose, glucuronic acid, galactosamine, glucose, galactose, xylose, and fucose. It could significantly alleviate weight loss, reduce fasting blood glucose levels, reverse dyslipidemia, reduce liver damage from oxidative stress, and improve insulin sensitivity. RT-PCR and Western blotting indicated that TGP could activate the phosphatidylinositol-3-kinase/protein kinase B signaling pathway to regulate disorders in glucolipid metabolism and improve insulin resistance. TGP increased the abundance of Allobaculum, Akkermansia, and Bifidobacterium, restored the microbiota abundance in the intestinal tracts of mice with T2DM, and promoted short-chain fatty acid production. This study provides new insights into the antidiabetic effects of TGP and highlights its potential as a natural hypoglycemic nutraceutical.
Video description generates natural language sentences that describe the subject, verb, and objects of a target video. It has been used to help visually impaired people understand video content, and it also plays an essential role in developing human-robot interaction. Dense video description is more difficult than simple video captioning because of object interactions and overlapping events. Deep learning is reshaping computer vision (CV) and natural language processing (NLP) technologies. There are hundreds of deep learning models, datasets, and evaluations that can help close the gaps in current research. This article addresses these gaps by evaluating some state-of-the-art approaches, focusing especially on deep learning and machine learning for video captioning in dense environments. It reviews some classic machine learning techniques and presents deep learning models together with details of benchmark datasets and their respective domains. It also reviews various evaluation metrics, including Bilingual Evaluation Understudy (BLEU), Metric for Evaluation of Translation with Explicit Ordering (METEOR), Word Mover's Distance (WMD), and Recall-Oriented Understudy for Gisting Evaluation (ROUGE), with their pros and cons. Finally, the article lists some future directions and proposes work on context enhancement using key scene extraction with object detection in a particular frame, in particular how to improve the context of a video description by detecting key frames through morphological image analysis. Additionally, the paper discusses a novel approach involving sentence reconstruction and context improvement through key frame object detection, which incorporates the fusion of large language models for refining results. The final results arise from enhancing the generated text of the proposed model by improving the predicted text and isolating objects using various keyframes. These keyframes identify dense events occurring in the video sequence.
Na⁺/K⁺-ATPase is a transmembrane protein that has important roles in the maintenance of electrochemical gradients across cell membranes by transporting three Na⁺ out of and two K⁺ into cells. Additionally, Na⁺/K⁺-ATPase participates in Ca²⁺-signaling transduction and neurotransmitter release by coordinating the ion concentration gradient across the cell membrane. Na⁺/K⁺-ATPase works synergistically with multiple ion channels in the cell membrane to form a dynamic network of ion homeostatic regulation and affects cellular communication by regulating chemical signals and the ion balance among different types of cells. Therefore, it is not surprising that Na⁺/K⁺-ATPase dysfunction has emerged as a risk factor for a variety of neurological diseases. However, published studies have so far only elucidated the important roles of Na⁺/K⁺-ATPase dysfunction in disease development, and we are lacking detailed mechanisms to clarify how Na⁺/K⁺-ATPase affects cell function. Our recent studies revealed that membrane loss of Na⁺/K⁺-ATPase is a key mechanism in many neurological disorders, particularly stroke and Parkinson's disease. Stabilization of plasma membrane Na⁺/K⁺-ATPase with an antibody is a novel strategy to treat these diseases. For this reason, Na⁺/K⁺-ATPase acts not only as a simple ion pump but also as a sensor/regulator or cytoprotective protein, participating in signal transduction such as neuronal autophagy and apoptosis, and glial cell migration. Thus, the present review attempts to summarize the novel biological functions of Na⁺/K⁺-ATPase and Na⁺/K⁺-ATPase-related pathogenesis. The potential for novel strategies to treat Na⁺/K⁺-ATPase-related brain diseases will also be discussed.
Cloud computing has drastically changed the delivery and consumption of live streaming content. The designs, challenges, and possible uses of cloud computing for live streaming are studied. A comprehensive overview of the technical and business issues surrounding cloud-based live streaming is provided, including the benefits of cloud computing, the various live streaming architectures, and the challenges that live streaming service providers face in delivering high-quality, real-time services. The different techniques used to improve the performance of video streaming, such as adaptive bit-rate streaming, multicast distribution, and edge computing, are discussed, and the necessity of low-latency, high-quality video transmission in cloud-based live streaming is underlined. Ways of improving user experience and live streaming service performance with cutting-edge technology, such as artificial intelligence and machine learning, are discussed. In addition, the legal and regulatory implications of cloud-based live streaming, including issues with network neutrality, data privacy, and content moderation, are addressed. The future of cloud computing for live streaming is then examined, with a look at the most likely new developments in trends and technology. For technology vendors, live streaming service providers, and regulators, the findings have major policy-relevant implications. Suggestions on how stakeholders should address these concerns and take advantage of the potential presented by this rapidly evolving sector are provided, along with insights into the key challenges and opportunities associated with cloud-based live streaming.
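Adaptive bit-rate streaming, mentioned above, can be illustrated with a minimal throughput-based selection rule. This is a sketch under stated assumptions, not a method taken from the surveyed paper: the function name `pick_bitrate`, the bitrate ladder, and the 0.8 safety factor are all hypothetical choices made for illustration.

```python
def pick_bitrate(ladder_kbps, throughput_kbps, safety=0.8):
    """Pick the highest rendition that fits within a fraction of measured throughput.

    ladder_kbps: available encodings in kbit/s (hypothetical ladder).
    safety: assumed headroom factor so that short throughput dips do not stall playback.
    """
    budget = throughput_kbps * safety
    fitting = [b for b in sorted(ladder_kbps) if b <= budget]
    # Fall back to the lowest rendition when nothing fits the budget.
    return fitting[-1] if fitting else min(ladder_kbps)


if __name__ == "__main__":
    ladder = [400, 1200, 2500, 5000]
    for measured in (300, 2000, 3500, 9000):
        print(measured, "kbit/s ->", pick_bitrate(ladder, measured), "kbit/s")
```

Production players usually combine throughput estimates with buffer occupancy; the rule above only shows the basic idea of matching the rendition to the measured network capacity.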
In this paper, a two-dimensional (2D) DOA estimation algorithm for coherent signals is proposed, using a separated linear acoustic vector-sensor (AVS) array consisting of two sparse AVS arrays. First, the partitioned spatial smoothing (PSS) technique is used to construct a block covariance matrix so as to decorrelate the coherent signals. Then a signal subspace is obtained by singular value decomposition (SVD) of the covariance matrix. Using the signal subspace, two extended signal subspaces are constructed to compensate for the aperture loss caused by PSS. The elevation angles are estimated by the estimation of signal parameters via rotational invariance techniques (ESPRIT) algorithm. Finally, the estimated elevation angles are used to estimate automatically paired azimuth angles. Compared with some other ESPRIT algorithms, the proposed algorithm shows higher estimation accuracy, as the simulation results confirm.
Among steganalysis techniques, detection of MV (motion vector) domain-based video steganography in the HEVC (High Efficiency Video Coding) standard remains a challenging issue. To improve detection performance, this paper proposes a steganalysis method that can perfectly detect MV-based steganography in HEVC. First, we define the local optimality of MVP (Motion Vector Prediction) based on the technology of AMVP (Advanced Motion Vector Prediction). Second, we show that in HEVC video, message embedding using either the MVP index or the MVD (Motion Vector Difference) may destroy this optimality of MVP. We then define the optimal rate of MVP as a steganalysis feature. Finally, we conduct steganalysis detection experiments on two general datasets for three popular steganography methods and compare the performance with four state-of-the-art steganalysis methods. The experimental results demonstrate the effectiveness of the proposed feature set. Furthermore, our method stands out for its practical applicability: it requires no model training and has low computational complexity, making it a viable solution for real-world scenarios.
In the realm of acoustic signal detection, the identification of weak signals, particularly in the presence of negative signal-to-noise ratios, poses a significant challenge. This challenge is further heightened when signals are acquired through fiber-optic hydrophones, as these signals often lack physical significance and resist clear systematic modeling. Conventional processing methods, e.g., the low-pass filter (LPF), require a thorough understanding of the effective signal bandwidth for noise reduction and may introduce undesirable time lags. This paper introduces an innovative feedback control method with dual Kalman filters for the demodulation of noisy phase signals in fiber-optic hydrophones. A mathematical model of the closed-loop system is established to guide the design of the feedback control, aiming to achieve a balance with the input phase signal. The dual Kalman filters are instrumental in mitigating the effects of signal noise, observation noise, and control execution noise, thereby enabling precise estimation of the input phase signals. The effectiveness of this feedback control method is demonstrated through examples, showcasing the restoration of low-noise signals, negative signal-to-noise ratio signals, and multi-frequency signals. This research contributes to the technical advancement of high-performance devices, including fiber-optic hydrophones and phase-locked amplifiers.
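The Kalman filtering step referred to above can be sketched in its simplest scalar form. This is not the paper's dual-filter feedback design, whose equations the abstract does not give; it is a single textbook filter under an assumed random-walk state model, and the names `kalman_1d`, `q`, and `r` are illustrative.

```python
def kalman_1d(measurements, q, r, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a random-walk state observed in noise.

    q: assumed process-noise variance, r: assumed measurement-noise variance.
    Returns the filtered state estimate after each measurement.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        p = p + q              # predict: uncertainty grows by the process noise
        k = p / (p + r)        # Kalman gain: trust in the new measurement
        x = x + k * (z - x)    # update the estimate with the innovation
        p = (1.0 - k) * p      # update the estimate variance
        estimates.append(x)
    return estimates


if __name__ == "__main__":
    est = kalman_1d([5.0] * 50, q=1e-4, r=0.1)
    print(est[0], est[-1])
```

A small `q` relative to `r` makes the filter smooth heavily, while a large `q` makes it track the measurements closely; that trade-off is the tuning knob behind any noise-suppression claim of this kind.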
The motivation for this study is that the quality of deep fakes is constantly improving, which creates a need for new detection methods. The proposed Customized Convolutional Neural Network method extracts structured data from video frames using facial landmark detection, and this data is then used as input to the CNN. The customized Convolutional Neural Network is a data augmentation-based CNN model that generates "fake data" or "fake images". This study was carried out using Python and its libraries. We used 242 films from the dataset gathered by the Deep Fake Detection Challenge, of which 199 were fake and the remaining 53 were real. Ten seconds were allotted for each video. There were 318 videos used in all, 199 of which were fake and 119 of which were real. Our proposed method achieved a testing accuracy of 91.47%, a loss of 0.342, and an AUC score of 0.92, outperforming two alternative approaches, CNN and MLP-CNN. Furthermore, our method achieved greater accuracy than contemporary models such as XceptionNet, Meso-4, EfficientNet-B0, MesoInception-4, VGG-16, and DST-Net. The novelty of this investigation is the development of a new Convolutional Neural Network (CNN) learning model that can accurately detect deep fake face photos.
In recent years, research on the estimation of human emotions has been active, and its application is expected in various fields. Biological reactions, such as electroencephalography (EEG) and root mean square successive difference (RMSSD), are indicators that are less influenced by individual arbitrariness. The present study used EEG and RMSSD signals to assess the emotions aroused by emotion-stimulating images in order to investigate whether various emotions are associated with characteristic biometric signal fluctuations. The participants underwent EEG and RMSSD measurement while viewing emotionally stimulating images and answering questionnaires. Real-time emotion analysis software was used to identify the evoked emotions by describing them in the Circumplex Model of Affect based on the EEG signals and RMSSD values. Emotions other than happiness did not follow the Circumplex Model of Affect in this study. However, ventral attentional activity may have increased the RMSSD value for disgust as the β/θ value increased in right-sided brain waves. Therefore, the right-sided brain wave results are necessary when measuring disgust. Happiness can be assessed easily using the Circumplex Model of Affect for positive scene analysis. Improving the current analysis methods may facilitate the investigation of face-to-face communication in the future using biometric signals.
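For readers unfamiliar with RMSSD, it is the square root of the mean of the squared differences between successive inter-beat intervals. The sketch below computes it from a list of intervals in milliseconds; the function name `rmssd` and the sample values are illustrative and are not data from the study.

```python
import math


def rmssd(intervals_ms):
    """Root mean square of successive differences of inter-beat intervals (ms)."""
    if len(intervals_ms) < 2:
        raise ValueError("need at least two intervals")
    diffs = [b - a for a, b in zip(intervals_ms, intervals_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


if __name__ == "__main__":
    # Successive differences are 10, -20, 10, so the mean square is 200.
    print(rmssd([800, 810, 790, 800]))
```

Because only successive differences enter the formula, RMSSD is insensitive to a slow drift in heart rate and mainly reflects beat-to-beat variability.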
Massive amounts of data are acquired in modern and future information technology industries such as communication, radar, and remote sensing. The large dimensionality and size of these data offer new opportunities to enhance the performance of signal processing in such applications and even motivate new ones. However, the curse of dimensionality is always a challenge when processing such high-dimensional signals. In practical tasks, high-dimensional signals need to be acquired, processed, and analyzed with high accuracy, robustness, and computational efficiency. This special section aims to address these challenges, with articles that attempt to develop new theories and methods best suited to the high-dimensional nature of the signals involved and that explore modern and emerging applications in this area.
Objective: To study the problematic use of video games among secondary school students in the city of Parakou in 2023. Methods: Descriptive cross-sectional study conducted in the commune of Parakou from December 2022 to July 2023. The study population consisted of students regularly enrolled in public and private secondary schools in the city of Parakou for the 2022-2023 academic year. A two-stage non-proportional stratified sampling technique combined with simple random sampling was adopted. The Problem Video Game Playing (PVP) scale was used to assess problematic gaming in the study population, while anxiety and depression were assessed using the Hospital Anxiety and Depression Scale (HADS). Results: A total of 1030 students were included. The mean age of the pupils surveyed was 15.06 ± 2.68 years, with extremes of 10 and 28 years. The 13-18 age group was the most represented, with a proportion of 59.6% (614) in the general population. Females predominated, at 52.8% (544), with a sex ratio of 0.89. The prevalence of problematic video game use, measured using the PVP scale, was 24.9%. Associated factors were male gender (p = 0.005), pocket money under 10,000 CFA (p = 0.001) and between 20,000 and 90,000 CFA (p = 0.030), addictive family behavior (p < 0.001), monogamous family (p = 0.023), good relationship with father (p = 0.020), organization of video game competitions (p = 0.001), and definite anxiety (p …). Conclusion: Substance-free addiction is struggling to attract the attention it deserves, as it did in its infancy everywhere else. This study complements existing data and serves as a reminder of the need to focus on this group of addictions, among which problematic use of video games remains the most frequent due to its accessibility and social tolerance. Preventive action combined with curative measures remains the most effective means of combating the problem at national level.
Colorectal cancer (CRC) remains one of the most commonly diagnosed and deadliest types of cancer worldwide. CRC displays a desmoplastic reaction (DR) that has been inversely associated with poor prognosis; less DR is associated with a better prognosis. This reaction generates excessive connective tissue, in which cancer-associated fibroblasts (CAFs) are critical cells that form a part of the tumor microenvironment. CAFs are directly involved in tumorigenesis through different mechanisms. However, their role in immunosuppression in CRC is not well understood, and the precise role of signal transducers and activators of transcription (STATs) in mediating CAF activity in CRC remains unclear. Among the myriad chemical and biological factors that affect CAFs, different cytokines mediate their function by activating STAT signaling pathways. Thus, the harmful effects of CAFs in favoring tumor growth and invasion may be modulated using STAT inhibitors. Here, we analyze the impact of different STATs on CAF activity and their immunoregulatory role.
The underwater wireless optical communication (UWOC) system has gradually become essential to underwater wireless communication technology. Unlike other existing works on UWOC systems, this paper evaluates the proposed machine learning-based signal demodulation methods on a self-built experimental platform. Based on this platform, we first construct a real signal dataset with ten modulation methods. Then, we propose a deep belief network (DBN)-based demodulator for feature extraction and multi-class feature classification. We also design an adaptive boosting (AdaBoost) demodulator as an alternative scheme without feature filtering for multiple modulated signals. Finally, extensive experimental results demonstrate that the AdaBoost demodulator significantly outperforms the other algorithms. The results also reveal that the demodulator accuracy decreases as the modulation order increases for a fixed received optical power. A higher-order modulation may achieve a higher effective transmission rate when the signal-to-noise ratio (SNR) is higher.
Due to the exponential growth of video data, aided by rapid advancements in multimedia technologies, it has become difficult for users to obtain information from a large video series. The process of providing an abstract of the entire video that includes the most representative frames is known as static video summarization. This approach enables rapid exploration, indexing, and retrieval of massive video libraries. We propose a framework for static video summarization based on Binary Robust Invariant Scalable Keypoints (BRISK) and the bisecting K-means clustering algorithm. The method recognizes relevant frames using BRISK by extracting keypoints and descriptors from video sequences. The BRISK features of the video frames are clustered using bisecting K-means, and the keyframe of each cluster is the frame closest to the cluster center. The appropriate number of clusters is determined using the silhouette coefficient, without supplying any clustering parameters. Experiments were carried out on the publicly available Open Video Project (OVP) dataset, which contains videos of different genres. The proposed method's effectiveness is compared with existing methods using a variety of evaluation metrics, and the proposed method achieves a trade-off between computational cost and quality.
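The keyframe rule described above (take the frame nearest to each cluster center) can be sketched as follows. The sketch assumes that per-frame feature vectors and cluster labels already exist; BRISK extraction, bisecting K-means, and the silhouette-based choice of cluster count are outside its scope. The function names and toy features are hypothetical.

```python
def centroid(vectors):
    """Component-wise mean of equally sized feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_keyframes(features, labels):
    """Map each cluster label to the index of the frame closest to its centroid."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    keyframes = {}
    for label, members in clusters.items():
        center = centroid([features[i] for i in members])
        keyframes[label] = min(members, key=lambda i: sq_dist(features[i], center))
    return keyframes


if __name__ == "__main__":
    feats = [[0, 0], [1, 0], [2, 0], [10, 10], [10, 12], [10, 11]]
    print(select_keyframes(feats, [0, 0, 0, 1, 1, 1]))
```

Selecting an actual frame rather than the centroid itself guarantees that the summary contains only frames that exist in the video.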
In video captioning methods based on an encoder-decoder, limited visual features are extracted by an encoder, and a natural sentence describing the video content is generated by a decoder. However, this kind of method depends on a single video input source and few visual labels, and it suffers from poor semantic alignment between video contents and generated natural sentences, which makes it unsuitable for accurately comprehending and describing the video contents. To address this issue, this paper proposes a video captioning method based on semantic topic-guided generation. First, a 3D convolutional neural network is used to extract the spatiotemporal features of videos during encoding. Then, the semantic topics of video data are extracted using the visual labels retrieved from similar video data. For decoding, a decoder is constructed by combining a novel Enhance-TopK sampling algorithm with a Generative Pre-trained Transformer-2 deep neural network, which decreases the influence of "deviation" in the semantic mapping process between videos and texts by jointly decoding a baseline and semantic topics of video contents. During this process, the designed Enhance-TopK sampling algorithm can alleviate the long-tail problem by dynamically adjusting the probability distribution of the predicted words. Finally, experiments are conducted on two publicly used datasets, Microsoft Research Video Description and Microsoft Research-Video to Text. The experimental results demonstrate that the proposed method outperforms several state-of-the-art approaches. Specifically, the performance indicators Bilingual Evaluation Understudy, Metric for Evaluation of Translation with Explicit Ordering, Recall Oriented Understudy for Gisting Evaluation-longest common subsequence, and Consensus-based Image Description Evaluation of the proposed method are improved by 1.2%, 0.1%, 0.3%, and 2.4% on the Microsoft Research Video Description dataset, and by 0.1%, 1.0%, 0.1%, and 2.8% on the Microsoft Research-Video to Text dataset, respectively, compared with existing video captioning methods. As a result, the proposed method can generate video captions that are more closely aligned with human natural language expression habits.
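The abstract does not specify how Enhance-TopK adjusts the word distribution, so the sketch below shows only plain top-k sampling with a temperature, the standard technique such a method would build on. The function name, parameters, and toy logits are assumptions for illustration and should not be read as the paper's algorithm.

```python
import math
import random


def top_k_sample(logits, k, temperature=1.0, rng=random):
    """Sample one token index from the k highest-scoring logits.

    Plain top-k sampling: keep the k best candidates, apply a softmax with
    temperature over them, and draw one index according to those weights.
    """
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    scaled = [logits[i] / temperature for i in top]
    peak = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(top, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    print([top_k_sample([0.1, 5.0, 0.2, 4.0], k=2, rng=rng) for _ in range(10)])
```

Raising the temperature flattens the weights among the retained candidates, which is one generic way to give less frequent words more probability mass.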
Weak signal reception is an important and challenging problem for communication systems, especially in the presence of non-Gaussian noise, in which case the performance of the optimal linear correlated receiver degrades dramatically. To address this, a novel uncorrelated reception scheme based on adaptive bistable stochastic resonance (ABSR) for a weak signal in additive Laplacian noise is investigated. By analyzing the key issue, namely the quantitative cooperative resonance matching relationship between the characteristics of the noisy signal and the nonlinear bistable system, an analytical expression for the bistable system parameters is derived. On this basis, by adaptively adjusting the bistable system parameters, the counterintuitive stochastic resonance (SR) phenomenon can be readily generated, in which the random noise becomes a benefit that assists signal transmission. Finally, it is demonstrated that the ABSR-based uncorrelated receiver achieves an improvement of approximately 8 dB in bit error ratio (BER) performance compared with the traditional uncorrelated receiver at low signal-to-noise ratio (SNR) conditions varying from -30 dB to -5 dB.
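A bistable system of the kind used in stochastic resonance is commonly modeled as dx/dt = a·x − b·x³ + u(t), where u(t) is the noisy input. The sketch below integrates that model with a forward Euler step and generates Laplacian noise as the difference of two exponential variates. The parameter values and function names are assumptions for illustration; the paper's adaptive parameter rule is not reproduced here.

```python
import random


def bistable_response(drive, a=1.0, b=1.0, dt=0.01, x0=0.0):
    """Forward-Euler integration of dx/dt = a*x - b*x**3 + u(t).

    drive: samples of the input u(t) (signal plus noise), one per time step.
    For a, b > 0 the undriven system has stable states at +/- sqrt(a/b).
    """
    x = x0
    out = []
    for u in drive:
        x += dt * (a * x - b * x ** 3 + u)
        out.append(x)
    return out


def laplace_noise(n, scale, rng):
    """Zero-mean Laplacian samples: difference of two exponentials with mean `scale`."""
    return [rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(1)
    noisy_input = [0.3 + n for n in laplace_noise(2000, 0.5, rng)]
    print(bistable_response(noisy_input)[-1])
```

With no input the state settles into one of the two wells; noise of suitable strength helps a weak signal push the state across the barrier, which is the effect the abstract refers to.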
A digital data-acquisition system based on XIA LLC products was used in a complex nuclear reaction experiment using radioactive ion beams. A flexible trigger system based on field-programmable gate array (FPGA) parametrization was developed to adapt to experiments of different sizes. A user-friendly interface was implemented, which allows script language expressions to be converted into FPGA internal control parameters. The proposed digital system can be combined with a conventional analog data-acquisition system to provide more flexibility. The performance of the combined system was verified using experimental data.
Regular exercise is a crucial aspect of daily life, as it enables individuals to stay physically active, lowers the likelihood of developing illnesses, and enhances life expectancy. The recognition of workout actions in video streams holds significant importance in computer vision research, as it aims to enhance exercise adherence, enable instant recognition, advance fitness tracking technologies, and optimize fitness routines. However, existing action datasets often lack diversity and specificity for workout actions, hindering the development of accurate recognition models. To address this gap, the Workout Action Video dataset (WAVd) has been introduced as a significant contribution. WAVd comprises a diverse collection of labeled workout action videos, meticulously curated to encompass various exercises performed by numerous individuals in different settings. This research proposes an innovative framework based on the Attention-driven Residual Deep Convolutional-Gated Recurrent Unit (ResDC-GRU) network for workout action recognition in video streams. Unlike image-based action recognition, videos contain spatio-temporal information, making the task more complex and challenging. While substantial progress has been made in this area, challenges persist in detecting subtle and complex actions, handling occlusions, and managing the computational demands of deep learning approaches. The proposed ResDC-GRU Attention model demonstrated exceptional classification performance, with 95.81% accuracy in classifying workout action videos, and also outperformed various state-of-the-art models. The method also yielded 81.6%, 97.2%, 95.6%, and 93.2% accuracy on established benchmark datasets, namely HMDB51, Youtube Actions, UCF50, and UCF101, respectively, showcasing its superiority and robustness in action recognition. The findings suggest practical implications in real-world scenarios where precise video action recognition is paramount, addressing the persisting challenges in the field. The WAVd dataset serves as a catalyst for the development of more robust and effective fitness tracking systems and ultimately promotes healthier lifestyles through improved exercise monitoring and analysis.
High-resolution video transmission requires a substantial amount of bandwidth. In this paper, we present a novel video processing methodology that integrates region of interest (ROI) identification and super-resolution enhancement. Our method begins with the accurate detection of ROIs within video sequences, followed by the application of advanced super-resolution techniques to these areas, thereby preserving visual quality while economizing on data transmission. To validate and benchmark our approach, we have curated a new gaming dataset tailored to evaluate the effectiveness of ROI-based super-resolution in practical applications. The proposed model architecture leverages the transformer network framework, guided by a carefully designed multi-task loss function, which facilitates concurrent learning and execution of both ROI identification and resolution enhancement tasks. This unified deep learning model exhibits remarkable performance in achieving super-resolution on our custom dataset. The implications of this research extend to optimizing low-bitrate video streaming scenarios. By selectively enhancing the resolution of critical regions in videos, our solution enables high-quality video delivery under constrained bandwidth conditions. Empirical results demonstrate a 15% reduction in transmission bandwidth compared to traditional super-resolution-based compression methods, without any perceivable decline in visual quality. This work thus contributes to the advancement of video compression and enhancement technologies, offering an effective strategy for improving digital media delivery efficiency and user experience, especially in bandwidth-limited environments. The integration of ROI identification and super-resolution presents promising avenues for future research and development in adaptive and intelligent video communication systems.
Video summarization aims to select key frames or key shots to create summaries for fast retrieval, compression, and efficient browsing of videos. Graph neural networks efficiently capture information about graph nodes and their neighbors, but ignore the dynamic dependencies between nodes. To address this challenge, we propose an innovative Adaptive Graph Convolutional Adjacency Matrix Network (TAMGCN), leveraging the attention mechanism to dynamically adjust dependencies between graph nodes. Specifically, we first segment shots and extract features of each frame, then compute the representative features of each shot. Subsequently, we utilize the attention mechanism to dynamically adjust the adjacency matrix of the graph convolutional network to better capture the dynamic dependencies between graph nodes. Finally, we fuse temporal features extracted by a Bi-directional Long Short-Term Memory network with structural features extracted by the graph convolutional network to generate high-quality summaries. Extensive experiments are conducted on two benchmark datasets, TVSum and SumMe, yielding F1-scores of 60.8% and 53.2%, respectively. Experimental results demonstrate that our method outperforms most state-of-the-art video summarization techniques.
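One common way to let attention set the adjacency matrix of a graph is to apply a row-wise softmax to scaled dot products of the node features. The sketch below shows that generic construction for a handful of shot-level feature vectors. It is an assumption about the general idea only; the exact TAMGCN formulation is not given in the abstract, and the function name and toy features are hypothetical.

```python
import math


def attention_adjacency(features):
    """Row-stochastic adjacency from scaled dot-product attention over node features.

    Entry [i][j] is the weight node i assigns to node j; each row sums to 1.
    """
    dim = len(features[0])
    adjacency = []
    for query in features:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in features]
        peak = max(scores)  # subtract the maximum for numerical stability
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        adjacency.append([e / total for e in exps])
    return adjacency


if __name__ == "__main__":
    shots = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
    for row in attention_adjacency(shots):
        print([round(w, 3) for w in row])
```

Because the weights are recomputed from the features, similar shots become strongly connected without a fixed, hand-built graph.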
Funding: Funded by the National Key Research and Development Program of China (2020YFD0900902), the Zhejiang Province Public Welfare Technology Application Research Project (LGJ21C20001), and the Zhejiang Provincial Key Research and Development Project of China (2019C02076 and 2019C02075).
Abstract: Type 2 diabetes mellitus (T2DM) is a complex metabolic disease that threatens human health. We investigated the effects of Tegillarca granosa polysaccharide (TGP) and its potential mechanisms in a mouse model of T2DM established through a high-fat diet and streptozotocin. TGP (5.1×10^(3) Da) was composed of mannose, glucosamine, rhamnose, glucuronic acid, galactosamine, glucose, galactose, xylose, and fucose. It significantly alleviated weight loss, reduced fasting blood glucose levels, reversed dyslipidemia, reduced liver damage from oxidative stress, and improved insulin sensitivity. RT-PCR and Western blotting indicated that TGP could activate the phosphatidylinositol-3-kinase/protein kinase B signaling pathway to regulate disorders in glucolipid metabolism and improve insulin resistance. TGP increased the abundance of Allobaculum, Akkermansia, and Bifidobacterium, restored the microbiota abundance in the intestinal tracts of mice with T2DM, and promoted short-chain fatty acid production. This study provides new insights into the antidiabetic effects of TGP and highlights its potential as a natural hypoglycemic nutraceutical.
Abstract: Video description generates natural language sentences that describe the subject, verb, and objects of the targeted video. It has been used to help visually impaired people understand video content, and it also plays an essential role in developing human-robot interaction. Dense video description is more difficult than simple video captioning because of object interactions and overlapping events. Deep learning is reshaping computer vision (CV) and natural language processing (NLP) technologies. There are hundreds of deep learning models, datasets, and evaluations that can help close the gaps in current research. This article addresses these gaps by evaluating some state-of-the-art approaches, focusing on deep learning and machine learning for video captioning in dense environments. It reviews some classic machine learning techniques and presents deep learning models together with details of benchmark datasets and their respective domains. It also reviews various evaluation metrics, including Bilingual Evaluation Understudy (BLEU), Metric for Evaluation of Translation with Explicit Ordering (METEOR), Word Mover's Distance (WMD), and Recall-Oriented Understudy for Gisting Evaluation (ROUGE), with their pros and cons. Finally, the article lists some future directions and proposes work on context enhancement using key scene extraction with object detection in a particular frame, in particular how to improve the context of video description by detecting key frames through morphological image analysis. Additionally, the paper discusses a novel approach involving sentence reconstruction and context improvement through key frame object detection, which incorporates the fusion of large language models for refining results. The final results arise from enhancing the generated text of the proposed model by improving the predicted text and isolating objects using various keyframes. These keyframes identify dense events occurring in the video sequence.
Funding: Supported by the National Natural Science Foundation of China, No. 82173800 (to JB), and the Shenzhen Science and Technology Program, No. KQTD20200820113040070 (to JB).
Abstract: Na^(+)/K^(+)-ATPase is a transmembrane protein that has important roles in the maintenance of electrochemical gradients across cell membranes by transporting three Na^(+) out of and two K^(+) into cells. Additionally, Na^(+)/K^(+)-ATPase participates in Ca^(2+)-signaling transduction and neurotransmitter release by coordinating the ion concentration gradient across the cell membrane. Na^(+)/K^(+)-ATPase works synergistically with multiple ion channels in the cell membrane to form a dynamic network of ion homeostatic regulation and affects cellular communication by regulating chemical signals and the ion balance among different types of cells. Therefore, it is not surprising that Na^(+)/K^(+)-ATPase dysfunction has emerged as a risk factor for a variety of neurological diseases. However, published studies have so far only elucidated the important roles of Na^(+)/K^(+)-ATPase dysfunction in disease development, and detailed mechanisms that clarify how Na^(+)/K^(+)-ATPase affects cell function are still lacking. Our recent studies revealed that membrane loss of Na^(+)/K^(+)-ATPase is a key mechanism in many neurological disorders, particularly stroke and Parkinson's disease. Stabilization of plasma membrane Na^(+)/K^(+)-ATPase with an antibody is a novel strategy to treat these diseases. Na^(+)/K^(+)-ATPase therefore acts not only as a simple ion pump but also as a sensor/regulator or cytoprotective protein, participating in signal transduction processes such as neuronal autophagy and apoptosis, and glial cell migration. The present review summarizes the novel biological functions of Na^(+)/K^(+)-ATPase and Na^(+)/K^(+)-ATPase-related pathogenesis. The potential for novel strategies to treat Na^(+)/K^(+)-ATPase-related brain diseases is also discussed.
Abstract: Cloud computing has drastically changed the delivery and consumption of live streaming content. The designs, challenges, and possible uses of cloud computing for live streaming are studied. A comprehensive overview of the technical and business issues surrounding cloud-based live streaming is provided, including the benefits of cloud computing, the various live streaming architectures, and the challenges that live streaming service providers face in delivering high-quality, real-time services. The different techniques used to improve the performance of video streaming, such as adaptive bit-rate streaming, multicast distribution, and edge computing, are discussed, and the necessity of low-latency and high-quality video transmission in cloud-based live streaming is underlined. Issues such as improving user experience and live streaming service performance using cutting-edge technology, like artificial intelligence and machine learning, are discussed. In addition, the legal and regulatory implications of cloud-based live streaming, including issues with network neutrality, data privacy, and content moderation, are addressed. The future of cloud computing for live streaming is then examined, with attention to the most likely new developments in trends and technology. For technology vendors, live streaming service providers, and regulators, the findings have major policy-relevant implications. Suggestions on how stakeholders should address these concerns and take advantage of the potential presented by this rapidly evolving sector, as well as insights into the key challenges and opportunities associated with cloud-based live streaming, are provided.
Funding: Supported by the National Natural Science Foundation of China (62261047, 62066040), the Foundation of Top-notch Talents by Education Department of Guizhou Province of China (KY[2018]075), the Science and Technology Foundation of Guizhou Province of China (ZK[2022]557, [2020]1Y004), the Science and Technology Research Program of the Chongqing Municipal Education Commission (KJQN202200637), the PhD Research Start-up Foundation of Tongren University (trxyDH1710), and the Tongren Science and Technology Planning Project ((2018)22).
Abstract: In this paper, a two-dimensional (2D) DOA estimation algorithm for coherent signals is proposed, using a separated linear acoustic vector-sensor (AVS) array consisting of two sparse AVS arrays. First, the partitioned spatial smoothing (PSS) technique is used to construct a block covariance matrix so as to decorrelate the coherent signals. A signal subspace is then obtained by singular value decomposition (SVD) of the covariance matrix. Using the signal subspace, two extended signal subspaces are constructed to compensate for the aperture loss caused by PSS. The elevation angles are estimated by the estimation of signal parameters via rotational invariance techniques (ESPRIT) algorithm. Finally, the estimated elevation angles are used to estimate automatically paired azimuth angles. Compared with some other ESPRIT algorithms, the proposed algorithm shows higher estimation accuracy, as demonstrated by the simulation results.
Funding: The National Natural Science Foundation of China (Grant Nos. 62272478, 62202496, 61872384).
Abstract: Among steganalysis techniques, detection of MV (motion vector) domain-based video steganography in the HEVC (High Efficiency Video Coding) standard remains a challenging issue. To improve detection performance, this paper proposes a steganalysis method that can perfectly detect MV-based steganography in HEVC. First, we define the local optimality of MVP (Motion Vector Prediction) based on the technology of AMVP (Advanced Motion Vector Prediction). Second, we show that in HEVC video, message embedding using either the MVP index or the MVD (Motion Vector Difference) may destroy this optimality of MVP. We then define the optimal rate of MVP as a steganalysis feature. Finally, we conduct steganalysis detection experiments on two general datasets for three popular steganography methods and compare the performance with four state-of-the-art steganalysis methods. The experimental results demonstrate the effectiveness of the proposed feature set. Furthermore, our method stands out for its practical applicability: it requires no model training and has low computational complexity, making it a viable solution for real-world scenarios.
Funding: Project supported by the National Key Research and Development Program of China (No. 2022YFB3203600), the National Natural Science Foundation of China (Nos. 12172323, 12132013, 12332003), and the Zhejiang Provincial Natural Science Foundation of China (No. LZ22A020003).
Abstract: In acoustic signal detection, the identification of weak signals, particularly at negative signal-to-noise ratios, poses a significant challenge. This challenge is heightened when signals are acquired through fiber-optic hydrophones, as these signals often lack physical significance and resist clear systematic modeling. Conventional processing methods, e.g., the low-pass filter (LPF), require a thorough understanding of the effective signal bandwidth for noise reduction and may introduce undesirable time lags. This paper introduces a feedback control method with dual Kalman filters for the demodulation of noisy phase signals in fiber-optic hydrophones. A mathematical model of the closed-loop system is established to guide the design of the feedback control, which aims to achieve a balance with the input phase signal. The dual Kalman filters mitigate the effects of signal noise, observation noise, and control execution noise, thereby enabling precise estimation of the input phase signals. The effectiveness of this feedback control method is demonstrated through examples that show the restoration of low-noise signals, negative signal-to-noise ratio signals, and multi-frequency signals. This research contributes to the technical advancement of high-performance devices, including fiber-optic hydrophones and phase-locked amplifiers.
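The noise-suppression role that the abstract assigns to Kalman filtering can be illustrated with a minimal scalar sketch. This is not the paper's dual-filter feedback design: the random-walk state model, the function name `kalman_1d`, and every parameter value are assumptions chosen purely for illustration.

```python
def kalman_1d(measurements, process_var, meas_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a random-walk state (illustrative assumption).

    measurements: noisy observations of the state
    process_var:  assumed variance of the state's random drift per step
    meas_var:     assumed variance of the observation noise
    """
    x, p = x0, p0          # state estimate and its variance
    estimates = []
    for z in measurements:
        # Predict: the state is modeled as unchanged, so only uncertainty grows.
        p = p + process_var
        # Update: blend prediction and measurement using the Kalman gain.
        k = p / (p + meas_var)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates
```

A larger `meas_var` relative to `process_var` makes the filter trust its own prediction more and smooth more heavily, which is the usual trade-off between noise rejection and tracking lag.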
Funding: Science and Technology Funds from the Liaoning Education Department (Serial Number: LJKZ0104).
Abstract: The motivation for this study is that the quality of deep fakes is constantly improving, which creates a need for new detection methods. The proposed Customized Convolutional Neural Network method extracts structured data from video frames using facial landmark detection, and this data is then used as input to the CNN. The customized Convolutional Neural Network is a data augmentation-based CNN model that generates 'fake data' or 'fake images'. This study was carried out using Python and its libraries. We used 242 videos from the dataset gathered by the Deep Fake Detection Challenge, of which 199 were fake and the remaining 53 were real. Each video was ten seconds long. There were 318 videos used in all, 199 of which were fake and 119 of which were real. Our proposed method achieved a testing accuracy of 91.47%, a loss of 0.342, and an AUC score of 0.92, outperforming two alternative approaches, CNN and MLP-CNN. Furthermore, our method achieved higher accuracy than contemporary models such as XceptionNet, Meso-4, EfficientNet-B0, MesoInception-4, VGG-16, and DST-Net. The novelty of this investigation is the development of a new Convolutional Neural Network (CNN) learning model that can accurately detect deep fake face photos.
Abstract: In recent years, research on the estimation of human emotions has been active, and its application is expected in various fields. Biological measures, such as electroencephalography (EEG) and the root mean square of successive differences (RMSSD), are indicators that are less influenced by individual arbitrariness. The present study used EEG signals and RMSSD values to assess the emotions aroused by emotion-stimulating images in order to investigate whether different emotions are associated with characteristic biometric signal fluctuations. The participants underwent EEG and RMSSD measurement while viewing emotionally stimulating images and answering questionnaires. Real-time emotion analysis software was used to identify the evoked emotions by describing them in the Circumplex Model of Affect based on the EEG signals and RMSSD values. Emotions other than happiness did not follow the Circumplex Model of Affect in this study. However, ventral attentional activity may have increased the RMSSD value for disgust as the β/θ value increased in right-sided brain waves. Therefore, the right-sided brain wave results are necessary when measuring disgust. Happiness can be assessed easily using the Circumplex Model of Affect for positive scene analysis. Improving the current analysis methods may facilitate the future investigation of face-to-face communication using biometric signals.
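For readers unfamiliar with RMSSD, the quantity is the square root of the mean squared difference between successive heartbeat (RR) intervals. The sketch below computes it for a short list; the function name `rmssd` and the sample intervals are hypothetical and are not data from the study.

```python
import math


def rmssd(rr_intervals_ms):
    """Root mean square of successive differences between RR intervals (ms)."""
    if len(rr_intervals_ms) < 2:
        raise ValueError("need at least two RR intervals")
    # Differences between each interval and the one that follows it.
    diffs = [b - a for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Hypothetical RR intervals in milliseconds, for illustration only.
example_rr = [800, 810, 790, 805]
example_value = rmssd(example_rr)
```

Higher values indicate greater beat-to-beat variability, which is why the measure is commonly used as a short-term heart rate variability index.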
Abstract: Massive amounts of data are acquired in modern and future information technology industries such as communication, radar, and remote sensing. The large dimensionality and size of these data offer new opportunities to enhance the performance of signal processing in such applications and even motivate new ones. However, the curse of dimensionality is always a challenge when processing such high-dimensional signals. In practical tasks, high-dimensional signals need to be acquired, processed, and analyzed with high accuracy, robustness, and computational efficiency. This special section aims to address these challenges: its articles develop new theories and methods suited to the high-dimensional nature of the signals involved and explore modern and emerging applications in this area.
Abstract: Objective: To study the problematic use of video games among secondary school students in the city of Parakou in 2023. Methods: Descriptive cross-sectional study conducted in the commune of Parakou from December 2022 to July 2023. The study population consisted of students regularly enrolled in public and private secondary schools in the city of Parakou for the 2022-2023 academic year. A two-stage non-proportional stratified sampling technique combined with simple random sampling was adopted. The Problem Video Game Playing (PVP) scale was used to assess problematic video game use in the study population, while anxiety and depression were assessed using the Hospital Anxiety and Depression Scale (HADS). Results: A total of 1030 students were included. The mean age of the pupils surveyed was 15.06 ± 2.68 years, with extremes of 10 and 28 years. The [13 - 18] age group was the most represented, with a proportion of 59.6% (614) in the general population. Females predominated, at 52.8% (544), with a sex ratio of 0.89. The prevalence of problematic video game use was 24.9%, measured using the Problem Video Game Playing scale. Associated factors were male gender (p = 0.005), pocket money under 10,000 cfa (p = 0.001) and between 20,000 - 90,000 cfa (p = 0.030), addictive family behavior (p < 0.001), monogamous family (p = 0.023), good relationship with father (p = 0.020), organization of video game competitions (p = 0.001) and definite anxiety (p Conclusion: Substance-free addiction is struggling to attract the attention it deserves, as it did in its infancy everywhere else. This study complements existing data and serves as a reminder of the need to focus on this group of addictions, among which problematic use of video games remains the most frequent due to its accessibility and social tolerance. Preventive action combined with curative measures remains the most effective means of combating the problem at the national level.
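The reported proportions can be cross-checked from the counts given in the Results. The short sketch below assumes the sex ratio is defined as males divided by females, which the abstract does not state explicitly; the variable names are illustrative.

```python
# Counts taken from the abstract.
total_students = 1030
females = 544
age_13_18 = 614

# Derived values; males are assumed to be everyone not counted as female.
males = total_students - females
sex_ratio = males / females                      # assumed definition: M/F
female_pct = 100 * females / total_students
age_group_pct = 100 * age_13_18 / total_students

summary = {
    "sex_ratio": round(sex_ratio, 2),
    "female_pct": round(female_pct, 1),
    "age_13_18_pct": round(age_group_pct, 1),
}
```

Under that assumption the derived figures agree with the reported 0.89, 52.8%, and 59.6%.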
Funding: Supported by the Programa de Apoyo a Proyectos de Investigación e Innovación Tecnológica (PAPIIT) de la Dirección General de Asuntos de Personal Académico, No. IN212722 and No. IA208424; Consejo Mexiquense de Ciencia y Tecnología, No. CS000132; and Consejo Nacional de Humanidades, Ciencia y Tecnología, No. CF-2023-I-563.
Abstract: Colorectal cancer (CRC) remains one of the most commonly diagnosed and deadliest types of cancer worldwide. CRC displays a desmoplastic reaction (DR) that has been associated with poor prognosis; less DR is associated with a better prognosis. This reaction generates excessive connective tissue, in which cancer-associated fibroblasts (CAFs) are critical cells that form part of the tumor microenvironment. CAFs are directly involved in tumorigenesis through different mechanisms. However, their role in immunosuppression in CRC is not well understood, and the precise role of signal transducers and activators of transcription (STATs) in mediating CAF activity in CRC remains unclear. Among the myriad chemical and biological factors that affect CAFs, different cytokines mediate their function by activating STAT signaling pathways. Thus, the harmful effects of CAFs in favoring tumor growth and invasion may be modulated using STAT inhibitors. Here, we analyze the impact of different STATs on CAF activity and their immunoregulatory role.
Funding: Supported by the major key project of Peng Cheng Laboratory under grants PCL2023AS31 and PCL2023AS1-2, the National Key Research and Development Program of China (No. 2019YFA0706604), and the Natural Science Foundation (NSF) of China (Nos. 61976169, 62293483, 62371451).
Abstract: The underwater wireless optical communication (UWOC) system has gradually become essential to underwater wireless communication technology. Unlike existing works on UWOC systems, this paper evaluates the proposed machine learning-based signal demodulation methods on a self-built experimental platform. Based on this platform, we first construct a real signal dataset with ten modulation methods. Then, we propose a deep belief network (DBN)-based demodulator for feature extraction and multi-class feature classification. We also design an adaptive boosting (AdaBoost) demodulator as an alternative scheme without feature filtering for multiple modulated signals. Finally, extensive experimental results demonstrate that the AdaBoost demodulator significantly outperforms the other algorithms. The results also reveal that demodulator accuracy decreases as the modulation order increases for a fixed received optical power. A higher-order modulation may achieve a higher effective transmission rate when the signal-to-noise ratio (SNR) is higher.
Funding: The authors would like to thank Research Supporting Project Number (RSP2024R444), King Saud University, Riyadh, Saudi Arabia.
Abstract: Due to the exponential growth of video data, aided by rapid advancements in multimedia technologies, it has become difficult for users to obtain information from a large video series. The process of providing an abstract of the entire video that includes the most representative frames is known as static video summarization. This method enables rapid exploration, indexing, and retrieval of massive video libraries. We propose a framework for static video summarization based on Binary Robust Invariant Scalable Keypoints (BRISK) and the bisecting K-means clustering algorithm. The method recognizes relevant frames using BRISK by extracting keypoints and descriptors from video sequences. The video frames' BRISK features are clustered using bisecting K-means, and the keyframe of each cluster is the frame nearest the cluster center. The appropriate number of clusters is determined using the silhouette coefficient, without any user-supplied clustering parameters. Experiments were carried out on the publicly available Open Video Project (OVP) dataset, which contains videos of different genres. The proposed method's effectiveness is compared with existing methods using a variety of evaluation metrics, and the proposed method achieves a trade-off between computational cost and quality.
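The keyframe rule described above (pick the frame nearest each cluster center) can be sketched in a few lines. The sketch assumes per-frame feature vectors and cluster labels are already available; it does not perform BRISK extraction or bisecting K-means, and the function name `select_keyframes` and the toy data are hypothetical.

```python
def select_keyframes(features, labels):
    """Return, for each cluster, the index of the frame nearest its centroid.

    features: list of equal-length numeric vectors, one per frame
    labels:   cluster label for each frame (same length as features)
    """
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)

    keyframes = []
    for label in sorted(clusters):
        members = clusters[label]
        dim = len(features[members[0]])
        # Centroid is the per-dimension mean of the cluster's feature vectors.
        centroid = [
            sum(features[i][d] for i in members) / len(members)
            for d in range(dim)
        ]

        def sq_dist(i, centroid=centroid, dim=dim):
            return sum((features[i][d] - centroid[d]) ** 2 for d in range(dim))

        keyframes.append(min(members, key=sq_dist))
    return keyframes


# Toy 2-D features for six frames in two clusters (illustrative only).
toy_features = [[0, 0], [1, 0], [2, 0], [10, 10], [10, 12], [10, 11]]
toy_labels = [0, 0, 0, 1, 1, 1]
toy_keyframes = select_keyframes(toy_features, toy_labels)
```

Choosing an actual frame rather than the centroid itself guarantees that every keyframe is a real image from the video.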
Funding: Supported in part by the National Natural Science Foundation of China under Grant 61873277, in part by the Natural Science Basic Research Plan in Shaanxi Province of China under Grant 2020JQ-758, and in part by the Chinese Postdoctoral Science Foundation under Grant 2020M673446.
Abstract: In video captioning methods based on an encoder-decoder, limited visual features are extracted by an encoder, and a natural sentence describing the video content is generated by a decoder. However, this kind of method depends on a single video input source and few visual labels, and it suffers from poor semantic alignment between video contents and generated sentences, which makes it unsuitable for accurately comprehending and describing video contents. To address this issue, this paper proposes a video captioning method based on semantic topic-guided generation. First, a 3D convolutional neural network is used to extract the spatiotemporal features of videos during encoding. Then, the semantic topics of the video data are extracted using visual labels retrieved from similar video data. For decoding, a decoder is constructed by combining a novel Enhance-TopK sampling algorithm with a Generative Pre-trained Transformer-2 deep neural network, which decreases the influence of "deviation" in the semantic mapping between videos and texts by jointly decoding a baseline and the semantic topics of the video contents. During this process, the Enhance-TopK sampling algorithm alleviates the long-tail problem by dynamically adjusting the probability distribution of the predicted words. Finally, experiments are conducted on two widely used datasets, Microsoft Research Video Description and Microsoft Research-Video to Text. The experimental results demonstrate that the proposed method outperforms several state-of-the-art approaches. Specifically, compared with existing video captioning methods, the Bilingual Evaluation Understudy, Metric for Evaluation of Translation with Explicit Ordering, Recall Oriented Understudy for Gisting Evaluation-longest common subsequence, and Consensus-based Image Description Evaluation scores of the proposed method improve by 1.2%, 0.1%, 0.3%, and 2.4% on the Microsoft Research Video Description dataset, and by 0.1%, 1.0%, 0.1%, and 2.8% on the Microsoft Research-Video to Text dataset, respectively. As a result, the proposed method generates captions that are more closely aligned with natural human language.
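As background for the decoding step, the sketch below shows plain top-k sampling, the standard technique that the abstract's Enhance-TopK algorithm builds on. It is not the Enhance-TopK algorithm itself, whose dynamic adjustment is not specified in the abstract; the function name `top_k_sample` and the `temperature` parameter are illustrative assumptions.

```python
import math
import random


def top_k_sample(logits, k, temperature=1.0, rng=random):
    """Sample a token index from the k highest-scoring logits.

    Standard top-k sampling: keep the k largest logits, renormalize them
    with a softmax, and draw one index from that truncated distribution.
    """
    if not 1 <= k <= len(logits):
        raise ValueError("k must be between 1 and len(logits)")
    # Indices of the k largest logits.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    scaled = [logits[i] / temperature for i in top]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(top, weights=weights, k=1)[0]
```

Truncating to the top k candidates removes the low-probability tail entirely, so any method that wants to preserve rare but valid words has to reshape the distribution rather than simply cut it.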
Funding: Supported in part by the National Natural Science Foundation of China (62001356), in part by the National Natural Science Foundation for Distinguished Young Scholar (61825104), in part by the National Key Research and Development Program of China (2022YFC3301300), and in part by the Innovative Research Groups of the National Natural Science Foundation of China (62121001).
Abstract: Weak signal reception is an important and challenging problem for communication systems, especially in the presence of non-Gaussian noise, in which case the performance of the optimal linear correlated receiver degrades dramatically. To address this, a novel uncorrelated reception scheme based on adaptive bistable stochastic resonance (ABSR) for a weak signal in additive Laplacian noise is investigated. By analyzing the key issue, namely the quantitative cooperative resonance matching relationship between the characteristics of the noisy signal and the nonlinear bistable system, an analytical expression for the bistable system parameters is derived. On this basis, through self-adaptive adjustment of the bistable system parameters, the counterintuitive stochastic resonance (SR) phenomenon can be readily generated, in which random noise becomes a benefit that assists signal transmission. Finally, it is demonstrated that the ABSR-based uncorrelated receiver achieves approximately 8 dB of bit error ratio (BER) performance improvement over the traditional uncorrelated receiver at low signal-to-noise ratio (SNR) conditions ranging from -30 dB to -5 dB.
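A bistable system of the kind used in stochastic resonance studies is commonly modeled as dx/dt = a·x − b·x³ + u(t), where u(t) is the noisy input. The sketch below integrates that classic form with the forward Euler method. The abstract does not give the paper's exact system or its adaptive parameter rule, so the equation, the function name `bistable_response`, and the default parameters are assumptions.

```python
def bistable_response(inputs, a=1.0, b=1.0, dt=0.01, x0=0.0):
    """Forward-Euler integration of dx/dt = a*x - b*x**3 + u(t).

    inputs: sequence of input samples u (signal plus noise), one per step
    a, b:   bistable system parameters (wells sit at x = +/- sqrt(a / b))
    dt:     integration step size
    """
    x = x0
    trajectory = []
    for u in inputs:
        x += dt * (a * x - b * x ** 3 + u)
        trajectory.append(x)
    return trajectory
```

With zero input the state settles into one of the two wells; a weak periodic input alone may be unable to drive transitions between them, and stochastic resonance refers to the regime where added noise makes those transitions follow the signal.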
Funding: This work was supported by the National Key R&D Program of China (Nos. 2023YFA1606403 and 2023YFE0101600) and the National Natural Science Foundation of China (Nos. 12027809, 11961141003, U1967201, 11875073 and 11875074).
Abstract: A digital data-acquisition system based on XIA LLC products was used in a complex nuclear reaction experiment using radioactive ion beams. A flexible trigger system based on field-programmable gate array (FPGA) parametrization was developed to adapt to different experimental sizes. A user-friendly interface was implemented that converts script language expressions into FPGA internal control parameters. The proposed digital system can be combined with a conventional analog data acquisition system to provide more flexibility. The performance of the combined system was verified using experimental data.
Abstract: Regular exercise is a crucial aspect of daily life, as it enables individuals to stay physically active, lowers the likelihood of developing illnesses, and enhances life expectancy. The recognition of workout actions in video streams holds significant importance in computer vision research, as it aims to enhance exercise adherence, enable instant recognition, advance fitness tracking technologies, and optimize fitness routines. However, existing action datasets often lack diversity and specificity for workout actions, hindering the development of accurate recognition models. To address this gap, the Workout Action Video dataset (WAVd) has been introduced as a significant contribution. WAVd comprises a diverse collection of labeled workout action videos, meticulously curated to encompass various exercises performed by numerous individuals in different settings. This research proposes an innovative framework based on the Attention driven Residual Deep Convolutional-Gated Recurrent Unit (ResDC-GRU) network for workout action recognition in video streams. Unlike image-based action recognition, videos contain spatio-temporal information, making the task more complex and challenging. While substantial progress has been made in this area, challenges persist in detecting subtle and complex actions, handling occlusions, and managing the computational demands of deep learning approaches. The proposed ResDC-GRU Attention model demonstrated exceptional classification performance with 95.81% accuracy in classifying workout action videos and also outperformed various state-of-the-art models. The method also yielded 81.6%, 97.2%, 95.6%, and 93.2% accuracy on established benchmark datasets, namely HMDB51, Youtube Actions, UCF50, and UCF101, respectively, showcasing its superiority and robustness in action recognition. The findings suggest practical implications in real-world scenarios where precise video action recognition is paramount, addressing the persisting challenges in the field. The WAVd dataset serves as a catalyst for the development of more robust and effective fitness tracking systems and ultimately promotes healthier lifestyles through improved exercise monitoring and analysis.
Funding: Funded by the National Key Research and Development Program of China (No. 2022YFC3302103).
Abstract: High-resolution video transmission requires a substantial amount of bandwidth. In this paper, we present a video processing methodology that integrates region of interest (ROI) identification and super-resolution enhancement. Our method begins with the detection of ROIs within video sequences, followed by the application of super-resolution techniques to these areas, thereby preserving visual quality while economizing on data transmission. To validate and benchmark our approach, we have curated a new gaming dataset tailored to evaluate the effectiveness of ROI-based super-resolution in practical applications. The proposed model architecture uses the transformer network framework, guided by a multi-task loss function that enables concurrent learning and execution of both the ROI identification and resolution enhancement tasks. This unified deep learning model performs well at super-resolution on our custom dataset. The implications of this research extend to optimizing low-bitrate video streaming scenarios. By selectively enhancing the resolution of critical regions in videos, our solution enables high-quality video delivery under constrained bandwidth conditions. Empirical results demonstrate a 15% reduction in transmission bandwidth compared to traditional super-resolution based compression methods, without any perceivable decline in visual quality. This work thus contributes to the advancement of video compression and enhancement technologies, offering an effective strategy for improving digital media delivery efficiency and user experience, especially in bandwidth-limited environments. The integration of ROI identification and super-resolution presents promising avenues for future research and development in adaptive and intelligent video communication systems.
Funding: This work was supported by the Natural Science Foundation of Gansu Province under Grant Nos. 21JR7RA570 and 20JR10RA334, the Basic Research Program of Gansu Province under Grant No. 22JR11RA106, the Gansu University of Political Science and Law Major Scientific Research and Innovation Projects under Grant No. GZF2020XZDA03, the Young Doctoral Fund Project of Higher Education Institutions in Gansu Province in 2022 under Grant No. 2022QB-123, the Gansu Province Higher Education Innovation Fund Project under Grant No. 2022A-097, the University-Level Research Funding Project under Grant No. GZFXQNLW022, and the University-Level Innovative Research Team of Gansu University of Political Science and Law.
Abstract: Video summarization aims to select key frames or key shots to create summaries for fast retrieval, compression, and efficient browsing of videos. Graph neural networks efficiently capture information about graph nodes and their neighbors, but ignore the dynamic dependencies between nodes. To address this challenge, we propose an Adaptive Graph Convolutional Adjacency Matrix Network (TAMGCN), which uses the attention mechanism to dynamically adjust dependencies between graph nodes. Specifically, we first segment shots and extract features of each frame, then compute the representative features of each shot. Subsequently, we use the attention mechanism to dynamically adjust the adjacency matrix of the graph convolutional network to better capture the dynamic dependencies between graph nodes. Finally, we fuse temporal features extracted by a Bi-directional Long Short-Term Memory network with structural features extracted by the graph convolutional network to generate high-quality summaries. Extensive experiments are conducted on two benchmark datasets, TVSum and SumMe, yielding F1-scores of 60.8% and 53.2%, respectively. Experimental results demonstrate that our method outperforms most state-of-the-art video summarization techniques.
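One common way to let attention shape a graph's adjacency matrix is to take a row-wise softmax over scaled dot products between node features. The sketch below shows that generic construction for shot-level features. It is an assumption for illustration, not the TAMGCN formulation, which the abstract does not specify; the function name `attention_adjacency` is hypothetical.

```python
import math


def attention_adjacency(shot_features):
    """Build a row-normalized adjacency matrix from shot feature vectors.

    Entry (i, j) is the softmax-normalized scaled dot product between
    shot i and shot j, so each row sums to 1 and similar shots receive
    larger edge weights.
    """
    n = len(shot_features)
    scale = math.sqrt(len(shot_features[0]))
    adjacency = []
    for i in range(n):
        scores = [
            sum(a * b for a, b in zip(shot_features[i], shot_features[j])) / scale
            for j in range(n)
        ]
        # Subtract the row maximum before exponentiating for stability.
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        adjacency.append([e / total for e in exps])
    return adjacency
```

Because the weights are recomputed from the features of each video, the graph structure adapts to the content instead of being fixed in advance.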