Funding: Supported in part by the National Natural Science Foundation of China under Grants 62273272 and 61873277, in part by the Chinese Postdoctoral Science Foundation under Grant 2020M673446, in part by the Key Research and Development Program of Shaanxi Province under Grant 2023-YBGY-243, and in part by the Youth Innovation Team of Shaanxi Universities.
Abstract: Current encoder-decoder video captioning models rely mainly on a single video input source. Because few studies use external corpus information to guide caption generation, the content of the generated captions is limited, which hinders accurate description and understanding of video content. To address this issue, this paper proposes a novel video captioning method guided by a sentence retrieval generation network (ED-SRG). First, a ResNeXt network model, an efficient convolutional network for online video understanding (ECO) model, and a long short-term memory (LSTM) network model are integrated to construct an encoder-decoder, which is used to extract the 2D features, 3D features, and object features of the video data. These features are decoded to generate textual sentences that conform to the video content and are used for sentence retrieval. Then, a sentence-transformer network model is employed to retrieve sentences in an external corpus that are semantically similar to these textual sentences, and the candidate sentences are screened by similarity measurement. Finally, a new model is constructed based on the GPT-2 network structure. It introduces a purpose-designed random selector that randomly selects among the predicted words with high probability in the corpus, guiding the generation of textual sentences that are closer to natural human expression. The proposed method is compared experimentally with several existing works. The results show that BLEU-4, CIDEr, ROUGE_L, and METEOR improve by 3.1%, 1.3%, 0.3%, and 1.5% on the public dataset MSVD, and by 1.3%, 0.5%, 0.2%, and 1.9% on the public dataset MSR-VTT, respectively. The proposed method therefore generates video captions with richer semantics than several state-of-the-art approaches.
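The retrieval and selection steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how similarity is measured or how the random selector works, so the sketch assumes cosine similarity over precomputed sentence embeddings and a top-k weighted sampler. The function names (`retrieve_candidates`, `random_select`), the threshold, and the value of `k` are all hypothetical.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve_candidates(query_vec, corpus, threshold=0.8, top_n=3):
    """Screen corpus sentences by similarity to the decoded caption.

    corpus is a list of (sentence, embedding) pairs; the embeddings are
    assumed to come from some sentence encoder (not computed here).
    """
    scored = [(cosine(query_vec, emb), sent) for sent, emb in corpus]
    kept = [pair for pair in scored if pair[0] >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [sent for _, sent in kept[:top_n]]


def random_select(word_probs, k=3, rng=random):
    """Assumed selector: sample one word among the k most probable ones,
    weighted by their predicted probabilities (top-k sampling)."""
    top = sorted(word_probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    words = [word for word, _ in top]
    weights = [prob for _, prob in top]
    return rng.choices(words, weights=weights, k=1)[0]
```

In this sketch, `retrieve_candidates` would be called with the embedding of the sentence decoded from the video features, and `random_select` would be applied at each generation step to the language model's next-word distribution. Sampling among several high-probability words, rather than always taking the single most probable one, is one common way to obtain more varied output.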
Funding: Supported by the National Natural Science Foundation of China (Nos. 11372309, 61304017), the Science and Technology Development Plan Key Project of Jilin Province (No. 20150204074GX), and the Science and Technology Special Fund Project of Provincial Academy Cooperation (No. 2017SYHZ00024).
Abstract: This paper studies robust attitude control for a novel coaxial twelve-rotor UAV, which has much greater payload capacity, higher drive capability, and better damage tolerance than a quad-rotor UAV. First, a dynamic and kinematic model of the coaxial twelve-rotor UAV is established. Considering model uncertainties and external disturbances, a robust backstepping sliding mode control (BSMC) method with a self-recurrent wavelet neural network (SRWNN) is proposed as the attitude controller for the coaxial twelve-rotor UAV. Combining backstepping control with sliding mode control simplifies the design procedure and yields much stronger robustness by drawing on the advantages of both controllers. The SRWNN serves as the uncertainty observer and effectively estimates the lumped uncertainties. The uniformly ultimate stability of the twelve-rotor system is then proved using the Lyapunov stability theorem. Finally, the validity of the proposed robust control method for the twelve-rotor UAV under model uncertainties and external disturbances is demonstrated through numerical simulations and twelve-rotor prototype experiments.
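To make the backstepping sliding mode idea concrete, here is a minimal single-axis sketch. It is not the controller from the paper. It assumes a double-integrator attitude model (angle `x1`, rate `x2`, inertia `inertia`) and a constant setpoint `xd`, and it replaces the SRWNN observer with a fixed switching gain `eta` assumed to exceed the disturbance bound. All gains, the inertia, and the disturbance signal are hypothetical values chosen for illustration.

```python
import math


def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def bsmc_control(x1, x2, xd, c1, c2, eta, inertia):
    """Backstepping sliding mode torque for the model
    x1' = x2, x2' = u / inertia + d, with a constant setpoint xd."""
    e1 = x1 - xd                 # angle tracking error
    alpha = -c1 * e1             # virtual rate command (backstepping step 1)
    e2 = x2 - alpha              # rate error, used as the sliding variable
    alpha_dot = -c1 * x2         # d(alpha)/dt, since xd is constant
    # Step 2: cancel known terms, add damping and a switching term
    return inertia * (alpha_dot - e1 - c2 * e2 - eta * sign(e2))


def simulate(xd=0.5, t_end=5.0, dt=0.001, c1=5.0, c2=5.0, eta=1.0,
             inertia=0.05):
    """Forward-Euler simulation; returns the final angle error in rad."""
    x1, x2 = 0.0, 0.0
    for k in range(int(round(t_end / dt))):
        t = k * dt
        d = 0.5 * math.sin(2.0 * t)   # bounded disturbance, |d| <= 0.5 < eta
        u = bsmc_control(x1, x2, xd, c1, c2, eta, inertia)
        x1 += dt * x2
        x2 += dt * (u / inertia + d)
    return x1 - xd
```

With the Lyapunov function V = (e1² + e2²)/2, this control law gives V' ≤ -c1·e1² - c2·e2² whenever `eta` is at least the disturbance bound, which is the usual argument for this class of controller. Calling `simulate()` returns the remaining angle error after 5 s. In discrete time the `sign` term causes chattering, which is one motivation for replacing the fixed bound with an uncertainty estimate, as the abstract does with the SRWNN.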