Funding: Supported by the National Natural Science Foundation of China (No. 62302540, https://www.nsfc.gov.cn/, accessed on 31/05/2024), the Open Foundation of Henan Key Laboratory of Cyberspace Situation Awareness (No. HNTS2022020, http://xt.hnkjt.gov.cn/data/pingtai/, accessed on 31/05/2024), and the Natural Science Foundation of Henan Province Youth Science Fund Project (No. 232300420422, https://kjt.henan.gov.cn/2022/09-02/2599082.html, accessed on 31/05/2024), all with author Fangfang Shan.
Abstract: Social media has become increasingly significant in modern society, but it has also turned into a breeding ground for the propagation of misleading information, potentially causing a detrimental impact on public opinion and daily life. Compared to pure text content, multimodal content significantly increases the visibility and shareability of posts. This has made the search for efficient modality representations and cross-modal information interaction methods a key focus in the field of multimodal fake news detection. To effectively address the critical challenge of accurately detecting fake news on social media, this paper proposes a fake news detection model based on cross-modal message aggregation and a gated fusion network (MAGF). MAGF first uses BERT to extract cumulative textual feature representations and word-level features, applies Faster Region-based Convolutional Neural Network (Faster R-CNN) to obtain image objects, and leverages ResNet-50 and Visual Geometry Group-19 (VGG-19) to obtain image region features and global features. The image region features and word-level text features are then projected into a low-dimensional space to calculate a text-image affinity matrix for cross-modal message aggregation. The gated fusion network combines text and image region features to obtain adaptively aggregated features. The interaction matrix is derived through an attention mechanism and further integrated with global image features using a co-attention mechanism to produce multimodal representations. Finally, these fused features are fed into a classifier for news categorization. Experiments were conducted on two public datasets, Twitter and Weibo. Results show that the proposed model achieves accuracy rates of 91.8% and 88.7% on the two datasets, respectively, significantly outperforming traditional unimodal and existing multimodal models.
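The affinity-matrix aggregation and gated fusion described above can be illustrated with a minimal PyTorch sketch. This is a hedged approximation, not the authors' released code: the module name, feature dimensions, softmax normalization, and sigmoid gate are all assumptions chosen only to show the general mechanism (project both modalities to a shared space, compute a word-region affinity matrix, aggregate image messages per word, and gate the mix).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AffinityGatedFusion(nn.Module):
    """Hypothetical sketch of cross-modal message aggregation with a gated fusion unit."""
    def __init__(self, text_dim=768, img_dim=2048, shared_dim=256):
        super().__init__()
        self.text_proj = nn.Linear(text_dim, shared_dim)   # project word-level BERT features
        self.img_proj = nn.Linear(img_dim, shared_dim)     # project image region features
        self.gate = nn.Linear(2 * shared_dim, shared_dim)  # gate over concatenated text + image messages

    def forward(self, text_feats, img_feats):
        # text_feats: (batch, n_words, text_dim); img_feats: (batch, n_regions, img_dim)
        t = self.text_proj(text_feats)              # (batch, n_words, shared_dim)
        v = self.img_proj(img_feats)                # (batch, n_regions, shared_dim)
        # Text-image affinity matrix: similarity of every word with every image region.
        affinity = torch.bmm(t, v.transpose(1, 2))  # (batch, n_words, n_regions)
        attn = F.softmax(affinity, dim=-1)
        img_msg = torch.bmm(attn, v)                # image messages aggregated per word
        # Gated fusion: adaptively mix text features with aggregated image messages.
        g = torch.sigmoid(self.gate(torch.cat([t, img_msg], dim=-1)))
        return g * t + (1 - g) * img_msg            # (batch, n_words, shared_dim)

# Example with random tensors standing in for BERT word features and image region features.
fusion = AffinityGatedFusion()
out = fusion(torch.randn(2, 20, 768), torch.randn(2, 36, 2048))
print(out.shape)  # torch.Size([2, 20, 256])
```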
Funding: Project supported by the National Natural Science Foundation of China (Grant No. 50874041) and the Funds of the Hunan Educational Bureau, China (Grant No. 09C314).
Abstract: We propose a scheme to implement an unconventional geometric logic gate separately in a two-mode cavity and a multi-mode cavity assisted by a strong classical driving field. The effect of cavity decay is included in the investigation. A numerical calculation is carried out, and the result shows that our scheme is more tolerant to cavity decay than the previous one because the time required to complete the logic gate is reduced by a factor of two.
Funding: Supported by the National Natural Science Foundation of China (No. 61674037), the Priority Academic Program Development of Jiangsu Higher Education Institutions, the National Power Grid Corp Science and Technology Project (No. SGTYHT/16-JS-198), and the State Grid Nanjing Power Supply Company Project (No. 1701052).
Abstract: This paper presents a reconfigurable RF front-end for multi-mode multi-standard (MMMS) applications. The designed RF front-end is fabricated in 0.18 μm RF CMOS technology. The low-noise characteristic is achieved by a noise-canceling technique, while the bandwidth is enhanced by a gate inductive peaking technique. Measurement results show that, as the input frequency ranges from 100 MHz to 2.9 GHz, the proposed reconfigurable RF front-end achieves a controllable voltage conversion gain (VCG) from 18 dB to 39 dB. The measured maximum input third-order intercept point (IIP3) is -4.9 dBm and the minimum noise figure (NF) is 4.6 dB. The consumed current ranges from 16 mA to 26.5 mA from a 1.8 V supply voltage. The chip occupies an area of 1.17 mm^2 including pads.
Funding: Supported by the Beijing Natural Science Foundation (No. 6212007), the National Key Technology R&D Program of China (No. 2022YFD2001701), and the Youth Research Fund of Beijing Academy of Agricultural and Forestry Sciences (No. QNJJ202014).
Abstract: Real-time analysis of the feeding behavior of fish is the premise of and key to accurate feeding guidance, and identifying fish behavior from a single source of information is susceptible to various interfering factors. To overcome these problems, this paper proposes an adaptive deep modular co-attention unified multi-modal transformer (DMCA-UMT). By fusing video, audio and water quality parameters, the whole process of fish feeding behavior can be identified. Firstly, features are extracted from the input video, audio and water quality parameter information to obtain feature vectors for the different modalities. Secondly, deep modular co-attention (DMCA) is introduced on the basis of the original cross-modal encoder, and adaptive learnable weights are added; the joint video-audio feature representation is obtained by automatically learning each modality's fusion contribution. Finally, the fused visual-audio information and text features are used to generate clip-level moment queries. The query decoder decodes the input features and uses a prediction head to obtain the final joint moment retrieval, i.e., the start and end times of fish feeding. The results show that the average mAP of the proposed algorithm reaches 75.3%, which is 37.8% higher than that of the unified multi-modal transformers (UMT) algorithm.
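To make the idea of co-attention with adaptive learnable fusion weights concrete, the following PyTorch sketch shows one plausible reading of that step. It is not the DMCA-UMT implementation: the class name, dimensions, use of standard multi-head cross-attention, and the softmax-normalized contribution weights are all assumptions for illustration only.

```python
import torch
import torch.nn as nn

class AdaptiveAVFusion(nn.Module):
    """Hypothetical sketch: video-audio co-attention with learnable fusion contribution weights."""
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.v2a = nn.MultiheadAttention(dim, heads, batch_first=True)  # video queries attend to audio
        self.a2v = nn.MultiheadAttention(dim, heads, batch_first=True)  # audio queries attend to video
        self.fusion_logits = nn.Parameter(torch.zeros(2))               # adaptive contribution weights

    def forward(self, video, audio):
        # video: (batch, n_frames, dim); audio: (batch, n_audio_tokens, dim)
        v_ctx, _ = self.v2a(video, audio, audio)   # audio messages aggregated onto the video timeline
        a_ctx, _ = self.a2v(audio, video, video)   # video messages aggregated onto the audio timeline
        w = torch.softmax(self.fusion_logits, dim=0)
        # Joint representation weighted by the learned contribution of each stream.
        joint = w[0] * (video + v_ctx).mean(dim=1) + w[1] * (audio + a_ctx).mean(dim=1)
        return joint                               # (batch, dim)

fusion = AdaptiveAVFusion()
joint = fusion(torch.randn(2, 32, 256), torch.randn(2, 50, 256))
print(joint.shape)  # torch.Size([2, 256])
```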
Funding: Supported by the National Basic Research Program of China (No. 2010CB327404) and the Priority Academic Program Development of Jiangsu Higher Education Institutions.
Abstract: An adjustable mixer for a surface acoustic wave (SAW)-less radio frequency (RF) front-end is presented in this paper. By changing the bias voltage, the presented mixer with reconfigurable voltage conversion gain (VCG) is suitable for multi-mode multi-standard (MMMS) applications. An equivalent local oscillator (LO) frequency-tunable high-Q band-pass filter (BPF) at the low noise amplifier (LNA) output is used to reject out-of-band interference signals. The base-band (BB) capacitor of the mixer is variable, providing 15 kinds of intermediate frequency (IF) bandwidth (BW). The proposed passive mixer with LNA is implemented in a TSMC 0.18 μm RF CMOS process and operates from 0.5 to 2.5 GHz with a measured maximum out-of-band rejection larger than 40 dB. The measured VCG of the front-end can be changed from 5 to 17 dB; the maximum input third-order intercept point (IIP3) is 0 dBm and the minimum noise figure (NF) is 3.7 dB. The chip occupies an area of 0.44 mm^2 including pads.
Abstract: In recent years, wearable device-based Human Activity Recognition (HAR) models have received significant attention. Previously developed HAR models use hand-crafted features to recognize human activities, which limits them to extracting only basic features. The images captured by wearable sensors contain advanced features, allowing them to be analyzed by deep learning algorithms to enhance the detection and recognition of human actions. Poor lighting and limited sensor capabilities can impact data quality, making the recognition of human actions a challenging task, and unimodal HAR approaches are not suitable in a real-time environment. Therefore, an updated HAR model is developed using multiple types of data and an advanced deep-learning approach. Firstly, the required signals and sensor data are accumulated from standard databases. From these signals, wave features are retrieved. The extracted wave features and sensor data are then given as input to recognize the human activity. An Adaptive Hybrid Deep Attentive Network (AHDAN) is developed by incorporating a "1D Convolutional Neural Network (1DCNN)" with a "Gated Recurrent Unit (GRU)" for the human activity recognition process. Additionally, the Enhanced Archerfish Hunting Optimizer (EAHO) is suggested to fine-tune the network parameters and enhance the recognition process. An experimental evaluation against various deep learning networks and heuristic algorithms confirms the effectiveness of the proposed HAR model. The EAHO-based HAR model outperforms traditional deep learning networks with an accuracy of 95.36, a recall of 95.25, a specificity of 95.48, and a precision of 95.47. The results show that the developed model is effective in recognizing human actions while taking less time. Additionally, it reduces computational complexity and overfitting by using an optimization approach.
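The 1DCNN-plus-GRU combination named above can be sketched as a generic hybrid backbone. The snippet below is an assumption-laden illustration rather than the AHDAN architecture: all layer sizes, the number of sensor channels, and the classification head are arbitrary, and the attention module and EAHO-based parameter tuning are omitted.

```python
import torch
import torch.nn as nn

class CNNGRUClassifier(nn.Module):
    """Hypothetical 1D-CNN + GRU backbone for sensor-based activity recognition."""
    def __init__(self, n_channels=6, n_classes=6, hidden=64):
        super().__init__()
        # 1D convolutions extract local temporal patterns from the raw sensor channels.
        self.conv = nn.Sequential(
            nn.Conv1d(n_channels, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool1d(2),
        )
        # A GRU models longer-range dependencies across the pooled time steps.
        self.gru = nn.GRU(64, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):
        # x: (batch, time_steps, n_channels), e.g. windows of accelerometer/gyroscope readings
        h = self.conv(x.transpose(1, 2))       # (batch, 64, time_steps // 2)
        _, last = self.gru(h.transpose(1, 2))  # last hidden state: (1, batch, hidden)
        return self.head(last.squeeze(0))      # (batch, n_classes) activity logits

model = CNNGRUClassifier()
logits = model(torch.randn(8, 128, 6))
print(logits.shape)  # torch.Size([8, 6])
```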