Journal Articles
151 articles found
1. A Hand Features Based Fusion Recognition Network with Enhancing Multi-Modal Correlation
Authors: Wei Wu, Yuan Zhang, Yunpeng Li, Chuanyang Li, Yan Hao. Computer Modeling in Engineering & Sciences, SCIE EI, 2024, Issue 7, pp. 537-555 (19 pages)
Fusing hand-based features in multi-modal biometric recognition enhances anti-spoofing capabilities. Additionally, it leverages inter-modal correlation to enhance recognition performance. Concurrently, the robustness and recognition performance of the system can be enhanced through judiciously leveraging the correlation among multimodal features. Nevertheless, two issues persist in multi-modal feature fusion recognition: first, the enhancement of recognition performance in fusion recognition has not comprehensively considered the inter-modality correlations among distinct modalities; second, during modal fusion, improper weight selection diminishes the salience of crucial modal features, thereby diminishing the overall recognition performance. To address these two issues, we introduce an enhanced DenseNet multimodal recognition network founded on feature-level fusion. The information from the three modalities is fused akin to RGB, and the input network augments the correlation between modes through channel correlation. Within the enhanced DenseNet network, the Efficient Channel Attention Network (ECA-Net) dynamically adjusts the weight of each channel to amplify the salience of crucial information in each modal feature. Depthwise separable convolution markedly reduces the training parameters and further enhances the feature correlation. Experimental evaluations were conducted on four multimodal databases, comprising six unimodal databases, including multispectral palmprint and palm vein databases from the Chinese Academy of Sciences. The Equal Error Rate (EER) values were 0.0149%, 0.0150%, 0.0099%, and 0.0050%, respectively. In comparison to other network methods for palmprint, palm vein, and finger vein fusion recognition, this approach substantially enhances recognition performance, rendering it suitable for high-security environments with practical applicability. The experiments in this article utilized a modest sample database comprising 200 individuals. The subsequent phase involves preparing for the extension of the method to larger databases.
Keywords: biometrics, multi-modal, correlation, deep learning, feature-level fusion
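The channel-attention step described in this abstract is straightforward to sketch. Below is a minimal ECA-style attention layer in PyTorch, assuming standard 4D feature maps; it illustrates the general technique, not the authors' exact network.

```python
import torch
import torch.nn as nn

class ECAAttention(nn.Module):
    """ECA-style channel attention sketch: pool each channel to one
    descriptor, model local cross-channel interaction with a cheap 1D
    convolution, and rescale channels by the resulting sigmoid weights."""
    def __init__(self, kernel_size: int = 3):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size,
                              padding=kernel_size // 2, bias=False)

    def forward(self, x):                      # x: (B, C, H, W)
        y = self.pool(x)                       # (B, C, 1, 1)
        y = y.squeeze(-1).transpose(1, 2)      # (B, 1, C)
        y = self.conv(y)                       # local cross-channel interaction
        w = torch.sigmoid(y).transpose(1, 2).unsqueeze(-1)  # (B, C, 1, 1)
        return x * w                           # reweight each channel

# Example: feature maps after a dense block (shapes are illustrative).
feats = torch.randn(2, 64, 28, 28)
out = ECAAttention()(feats)
```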
2. A Comprehensive Survey on Deep Learning Multi-Modal Fusion: Methods, Technologies and Applications
Authors: Tianzhe Jiao, Chaopeng Guo, Xiaoyue Feng, Yuming Chen, Jie Song. Computers, Materials & Continua, SCIE EI, 2024, Issue 7, pp. 1-35 (35 pages)
Multi-modal fusion technology has gradually become a fundamental task in many fields, such as autonomous driving, smart healthcare, sentiment analysis, and human-computer interaction. It is rapidly becoming a dominant research direction due to its powerful perception and judgment capabilities. Under complex scenes, multi-modal fusion technology utilizes the complementary characteristics of multiple data streams to fuse different data types and achieve more accurate predictions. However, achieving outstanding performance is challenging because of equipment performance limitations, missing information, and data noise. This paper comprehensively reviews existing methods based on multi-modal fusion techniques and completes a detailed and in-depth analysis. According to the data fusion stage, multi-modal fusion has four primary methods: early fusion, deep fusion, late fusion, and hybrid fusion. The paper surveys the three major multi-modal fusion technologies that can significantly enhance the effect of data fusion and further explores the applications of multi-modal fusion technology in various fields. Finally, it discusses the challenges and explores potential research opportunities. Multi-modal tasks still need intensive study because of data heterogeneity and quality. Preserving complementary information and eliminating redundant information between modalities is critical in multi-modal technology. Invalid data fusion methods may introduce extra noise and lead to worse results. This paper provides a comprehensive and detailed summary in response to these challenges.
Keywords: multi-modal fusion, representation, translation, alignment, deep learning, comparative analysis
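The fusion-stage taxonomy in this survey is easiest to see in code. The sketch below contrasts the two ends of the spectrum, early (feature-level) and late (decision-level) fusion, with toy encoders and dimensions chosen purely for illustration.

```python
import torch
import torch.nn as nn

# Toy encoders for two modalities (hypothetical sizes, illustration only).
enc_a = nn.Sequential(nn.Linear(32, 16), nn.ReLU())
enc_b = nn.Sequential(nn.Linear(48, 16), nn.ReLU())
head_early = nn.Linear(16 + 16, 10)
head_a, head_b = nn.Linear(16, 10), nn.Linear(16, 10)

xa, xb = torch.randn(4, 32), torch.randn(4, 48)

# Early fusion: concatenate features before a single decision head.
early_logits = head_early(torch.cat([enc_a(xa), enc_b(xb)], dim=-1))

# Late fusion: run separate per-modality heads and average the decisions.
late_logits = (head_a(enc_a(xa)) + head_b(enc_b(xb))) / 2
```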
3. Human Gait Recognition for Biometrics Application Based on Deep Learning Fusion Assisted Framework
Authors: Ch Avais Hanif, Muhammad Ali Mughal, Muhammad Attique Khan, Nouf Abdullah Almujally, Taerang Kim, Jae-Hyuk Cha. Computers, Materials & Continua, SCIE EI, 2024, Issue 1, pp. 357-374 (18 pages)
The demand for a non-contact biometric approach for candidate identification has grown over the past ten years. As one of the most important biometric applications, human gait analysis is a significant research topic in computer vision. Researchers have paid a lot of attention to gait recognition, specifically the identification of people based on their walking patterns, due to its potential to correctly identify people from far away. Gait recognition systems have been used in a variety of applications, including security, medical examinations, identity management, and access control. These systems require a complex combination of technical, operational, and definitional considerations. The employment of gait recognition techniques and technologies has produced a number of beneficial and well-liked applications. This work proposes a novel deep learning-based framework for human gait classification in video sequences. The framework's main challenge is improving the accuracy of gait classification under varying conditions, such as carrying a bag and changing clothes. The proposed method's first step is selecting two pre-trained deep learning models and training them from scratch using deep transfer learning. Next, the deep models are trained using static hyperparameters; however, the learning rate is calculated using the particle swarm optimization (PSO) algorithm. Then, the best features are selected from both trained models using the Harris Hawks controlled Sine-Cosine optimization algorithm and combined using a novel correlation-based fusion technique. Finally, the fused best features are categorized using medium, bi-layer, and tri-layered neural networks. The experimental process of the suggested technique was carried out on the publicly accessible CASIA-B dataset, and an improved accuracy of 94.14% was achieved. The accuracy achieved by the proposed method improves on recent state-of-the-art techniques, which shows the significance of this work.
Keywords: gait recognition, covariant factors, biometrics, deep learning, fusion, feature selection
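A vanilla PSO pass over a single scalar, the learning rate, might look like the sketch below; the fitness function here is a stand-in for the validation loss the paper would actually minimize, and all hyperparameters are illustrative assumptions.

```python
import numpy as np

def pso_learning_rate(fitness, lo=1e-5, hi=1e-1, n_particles=10, iters=20,
                      w=0.7, c1=1.5, c2=1.5, seed=0):
    """Vanilla particle swarm optimization over one scalar (the learning rate)."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(lo, hi, n_particles)
    vel = np.zeros(n_particles)
    pbest, pbest_fit = pos.copy(), np.array([fitness(p) for p in pos])
    gbest = pbest[pbest_fit.argmin()]
    for _ in range(iters):
        r1, r2 = rng.random(n_particles), rng.random(n_particles)
        # Pull each particle toward its personal best and the global best.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lo, hi)
        fit = np.array([fitness(p) for p in pos])
        improved = fit < pbest_fit
        pbest[improved], pbest_fit[improved] = pos[improved], fit[improved]
        gbest = pbest[pbest_fit.argmin()]
    return gbest

# Stand-in fitness: in the paper this would be validation loss after training.
best_lr = pso_learning_rate(lambda lr: (np.log10(lr) + 3) ** 2)
```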
4. PowerDetector: Malicious PowerShell Script Family Classification Based on Multi-Modal Semantic Fusion and Deep Learning (cited: 1)
Authors: Xiuzhang Yang, Guojun Peng, Dongni Zhang, Yuhang Gao, Chenguang Li. China Communications, SCIE CSCD, 2023, Issue 11, pp. 202-224 (23 pages)
PowerShell has been widely deployed in fileless malware and advanced persistent threat (APT) attacks due to its high stealthiness and living-off-the-land technique. However, existing works mainly focus on deobfuscation and malicious detection, lacking malicious PowerShell family classification and behavior analysis. Moreover, the state-of-the-art methods fail to capture fine-grained features and semantic relationships, resulting in low robustness and accuracy. To this end, we propose PowerDetector, a novel malicious PowerShell script detector based on multi-modal semantic fusion and deep learning. Specifically, we design four feature extraction methods to extract key features from characters, tokens, abstract syntax trees (ASTs), and semantic knowledge graphs. Then, we design four embeddings (i.e., Char2Vec, Token2Vec, AST2Vec, and Rela2Vec) and construct a multi-modal fusion algorithm to concatenate feature vectors from different views. Finally, we propose a combined model based on a transformer and CNN-BiLSTM to implement PowerShell family detection. Our experiments with five types of PowerShell attacks show that PowerDetector can accurately detect various obfuscated and stealthy PowerShell scripts, with a 0.9402 precision, a 0.9358 recall, and a 0.9374 F1-score. Furthermore, through single-modal and multi-modal comparison experiments, we demonstrate that PowerDetector's multi-modal embedding and deep learning model can achieve better accuracy and even identify more unknown attacks.
Keywords: deep learning, malicious family detection, multi-modal semantic fusion, PowerShell
5. Fusion of Hash-Based Hard and Soft Biometrics for Enhancing Face Image Database Search and Retrieval
Authors: Ameerah Abdullah Alshahrani, Emad Sami Jaha, Nahed Alowidi. Computers, Materials & Continua, SCIE EI, 2023, Issue 12, pp. 3489-3509 (21 pages)
The utilization of digital picture search and retrieval has grown substantially in numerous fields for different purposes during the last decade, owing to the continuing advances in image processing and computer vision approaches. In multiple real-life applications, for example, social media, content-based face picture retrieval is a well-invested technique for large-scale databases, where there is a significant necessity for reliable retrieval capabilities enabling quick search among a vast number of pictures. Humans widely employ faces for recognizing and identifying people. Thus, face recognition through formal or personal pictures is increasingly used in various real-life applications, such as helping crime investigators retrieve matching images from face image databases to identify victims and criminals. However, such face image retrieval becomes more challenging in large-scale databases, where traditional vision-based face analysis requires considerable additional storage space beyond that already occupied by the raw face images to store the extracted lengthy feature vectors, and takes much longer to process and match thousands of face images. This work mainly contributes to enhancing face image retrieval performance in large-scale databases using hash codes inferred by locality-sensitive hashing (LSH) for facial hard and soft biometrics, as Hard BioHash and Soft BioHash respectively, to be used as a search input for retrieving the top-k matching faces. Moreover, we propose the multi-biometric score-level fusion of both face hard and soft BioHashes (Hard-Soft BioHash Fusion) for further augmented face image retrieval. The experimental outcomes, applied on the Labeled Faces in the Wild (LFW) dataset and the related attributes dataset (LFW-attributes), demonstrate that the suggested fusion approach (Hard-Soft BioHash Fusion) significantly improved the retrieval performance compared to solely using Hard BioHash or Soft BioHash in isolation, where the suggested method provides an augmented accuracy of 87% when executed on 1000 specimens and 77% on 5743 samples. These results remarkably outperform the results of the Hard BioHash method (by 50% on the 1000 samples and 30% on the 5743 samples) and the Soft BioHash method (by 78% on the 1000 samples and 63% on the 5743 samples).
Keywords: face image retrieval, soft biometrics, similar pictures, hashing, database search, large databases, score-level fusion, multimodal fusion
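As a rough illustration of the hashing idea, the sketch below uses sign-random-projection LSH, one common LSH family, to turn feature vectors into compact binary codes and rank a gallery by Hamming distance; the exact BioHash construction in the paper may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

def lsh_hash(features, planes):
    """Sign-random-projection LSH: one bit per random hyperplane."""
    return (features @ planes.T > 0).astype(np.uint8)

# Gallery of feature vectors (stand-ins for hard/soft biometric features).
gallery = rng.normal(size=(5743, 128))
planes = rng.normal(size=(64, 128))          # 64-bit hash codes
codes = lsh_hash(gallery, planes)

def top_k(query_feat, k=5):
    q = lsh_hash(query_feat[None, :], planes)
    ham = (codes != q).sum(axis=1)           # Hamming distance to every code
    return np.argsort(ham)[:k]               # indices of top-k matching faces

matches = top_k(gallery[42])                 # item 42 should rank first
```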
6. Multi-Modal Military Event Extraction Based on Knowledge Fusion
Authors: Yuyuan Xiang, Yangli Jia, Xiangliang Zhang, Zhenling Zhang. Computers, Materials & Continua, SCIE EI, 2023, Issue 10, pp. 97-114 (18 pages)
Event extraction stands as a significant endeavor within the realm of information extraction, aspiring to automatically extract structured event information from vast volumes of unstructured text. Extracting event elements from multi-modal data remains a challenging task due to the presence of a large number of images and overlapping event elements in the data. Although researchers have proposed various methods to accomplish this task, most existing event extraction models cannot address these challenges because they are only applicable to text scenarios. To solve the above issues, this paper proposes a multi-modal event extraction method based on knowledge fusion. Specifically, for event-type recognition, we use a meticulous pipeline approach that integrates multiple pre-trained models. This approach enables a more comprehensive capture of the multidimensional event semantic features present in military texts, thereby enhancing the interconnectedness of information between trigger words and events. For event element extraction, we propose a method for constructing a priori templates that combine event types with corresponding trigger words. This approach facilitates the acquisition of fine-grained input samples containing event trigger words, thus enabling the model to understand the semantic relationships between elements in greater depth. Furthermore, a fusion method for spatial mapping of textual event elements and image elements is proposed to reduce the category number overload and effectively achieve multi-modal knowledge fusion. The experimental results based on the CCKS 2022 dataset show that our method has achieved competitive results, with a comprehensive evaluation F1-score of 53.4%. These results validate the effectiveness of our method in extracting event elements from multi-modal data.
Keywords: event extraction, multi-modal, knowledge fusion, pre-trained models
7. A Novel Fusion System Based on Iris and Ear Biometrics for E-exams
Authors: S. A. Shaban, Hosnia M. M. Ahmed, D. L. Elsheweikh. Intelligent Automation & Soft Computing, SCIE, 2023, Issue 3, pp. 3295-3315 (21 pages)
With the rapid spread of the coronavirus epidemic all over the world, educational and other institutions are heading towards digitization. In the era of digitization, identifying educational e-platform users using ear and iris based multi-modal biometric systems constitutes an urgent and interesting research topic to preserve enterprise security, particularly with wearing a face mask as a precaution against the new coronavirus epidemic. This study proposes a multimodal system based on ear and iris biometrics at the feature fusion level to identify students in electronic examinations (E-exams) during the COVID-19 pandemic. The proposed system comprises four steps. The first step is image preprocessing, which includes enhancing, segmenting, and extracting the regions of interest. The second step is feature extraction, where the Haralick texture and shape methods are used to extract the features of ear images, whereas Tamura texture and color histogram methods are used to extract the features of iris images. The third step is feature fusion, where the extracted features of the ear and iris images are combined into one sequential fused vector. The fourth step is matching, which is executed using the City Block Distance (CTB) for student identification. The findings of the study indicate that the system's recognition accuracy is 97%, with a 2% False Acceptance Rate (FAR), a 4% False Rejection Rate (FRR), a 94% Correct Recognition Rate (CRR), and a 96% Genuine Acceptance Rate (GAR). In addition, the proposed recognition system achieved higher accuracy than other related systems.
Keywords: City Block Distance (CTB), COVID-19, ear biometrics, e-exams, feature-level fusion, iris biometrics, multimodal biometrics, student identity
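The fusion and matching steps (three and four) reduce to a few lines. The sketch below concatenates hypothetical ear and iris feature vectors and identifies the enrolled template with the smallest City Block (L1) distance; feature dimensions and templates are made up for illustration.

```python
import numpy as np
from scipy.spatial.distance import cityblock

# Hypothetical per-student fused templates: ear features followed by iris
# features, concatenated into one sequential vector as the paper describes.
enrolled = {sid: np.random.rand(96) for sid in ("s01", "s02", "s03")}

def identify(ear_feat, iris_feat):
    probe = np.concatenate([ear_feat, iris_feat])   # feature-level fusion
    # City Block (L1) distance against every enrolled template.
    return min(enrolled, key=lambda sid: cityblock(probe, enrolled[sid]))

who = identify(np.random.rand(48), np.random.rand(48))
```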
8. Robust Symmetry Prediction with Multi-Modal Feature Fusion for Partial Shapes
Authors: Junhua Xi, Kouquan Zheng, Yifan Zhong, Longjiang Li, Zhiping Cai, Jinjing Chen. Intelligent Automation & Soft Computing, SCIE, 2023, Issue 3, pp. 3099-3111 (13 pages)
In geometry processing, symmetry research benefits from global geometric features of complete shapes, but the shape of an object captured in real-world applications is often incomplete due to the limited sensor resolution, single viewpoint, and occlusion. Different from the existing works predicting symmetry from the complete shape, we propose a learning approach for symmetry prediction based on a single RGB-D image. Instead of directly predicting the symmetry from incomplete shapes, our method consists of two modules, i.e., the multi-modal feature fusion module and the detection-by-reconstruction module. Firstly, we build a channel-transformer network (CTN) to extract cross-fusion features from the RGB-D input as the multi-modal feature fusion module, which helps us aggregate features from the color and the depth separately. Then, our self-reconstruction network based on a 3D variational auto-encoder (3D-VAE) takes the global geometric features as input, followed by a prediction symmetry network to detect the symmetry. Our experiments are conducted on three public datasets: ShapeNet, YCB, and ScanNet. We demonstrate that our method can produce reliable and accurate results.
Keywords: symmetry prediction, multi-modal feature fusion, partial shapes
9. Fine-Grained Soft Ear Biometrics for Augmenting Human Recognition
Authors: Ghoroub Talal Bostaji, Emad Sami Jaha. Computer Systems Science & Engineering, SCIE EI, 2023, Issue 11, pp. 1571-1591 (21 pages)
Human recognition technology based on biometrics has become a fundamental requirement in all aspects of life due to increased concerns about security and privacy issues. Therefore, biometric systems have emerged as a technology with the capability to identify or authenticate individuals based on their physiological and behavioral characteristics. Among different viable biometric modalities, the human ear structure can offer unique and valuable discriminative characteristics for human recognition systems. In recent years, most existing traditional ear recognition systems have been designed based on computer vision models and have achieved successful results. Nevertheless, such traditional models can be sensitive to several unconstrained environmental factors. As such, some traits may be difficult to extract automatically but can still be semantically perceived as soft biometrics. This research proposes a new group of semantic features to be used as soft ear biometrics, mainly inspired by conventional descriptive traits used naturally by humans when identifying or describing each other. Hence, the research study is focused on the fusion of the soft ear biometric traits with traditional (hard) ear biometric features to investigate their validity and efficacy in augmenting human identification performance. The proposed framework has two subsystems: first, a computer vision-based subsystem, extracting traditional (hard) ear biometric traits using principal component analysis (PCA) and local binary patterns (LBP); and second, a crowdsourcing-based subsystem, deriving semantic (soft) ear biometric traits. Several feature-level fusion experiments were conducted using the AMI database to evaluate the proposed algorithm's performance. The obtained results for both identification and verification showed that the proposed soft ear biometric information significantly improved the recognition performance of traditional ear biometrics, reaching up to 12% for LBP and 5% for PCA descriptors when fusing all three capacities (PCA, LBP, and soft traits) using a k-nearest neighbors (KNN) classifier.
Keywords: ear biometrics, soft biometrics, human ear recognition, semantic features, feature-level fusion, computer vision, machine learning
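A compact sketch of the hard-plus-soft fusion pipeline is shown below, using scikit-learn and scikit-image on synthetic stand-in data: PCA and uniform-LBP histograms as the hard descriptors, random scores standing in for the crowdsourced soft traits, and a KNN classifier over the concatenated vector.

```python
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-ins: 60 grayscale ear images (32x32) from 6 subjects,
# plus 5 crowdsourced soft-trait scores per image.
rng = np.random.default_rng(0)
images = (rng.random((60, 32, 32)) * 255).astype(np.uint8)
soft_traits = rng.random((60, 5))
labels = np.repeat(np.arange(6), 10)

def lbp_hist(img, P=8, R=1):
    """Histogram of uniform LBP codes as a texture descriptor."""
    lbp = local_binary_pattern(img, P, R, method="uniform")
    hist, _ = np.histogram(lbp, bins=P + 2, range=(0, P + 2), density=True)
    return hist

pca_feats = PCA(n_components=20).fit_transform(images.reshape(60, -1))
lbp_feats = np.array([lbp_hist(im) for im in images])

# Feature-level fusion of all three capacities, then a KNN classifier.
fused = np.hstack([pca_feats, lbp_feats, soft_traits])
knn = KNeighborsClassifier(n_neighbors=3).fit(fused, labels)
pred = knn.predict(fused[:5])
```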
10. Adaptive Multi-modal Fusion Instance Segmentation for CAEVs in Complex Conditions: Dataset, Framework and Verifications (cited: 2)
Authors: Pai Peng, Keke Geng, Guodong Yin, Yanbo Lu, Weichao Zhuang, Shuaipeng Liu. Chinese Journal of Mechanical Engineering, SCIE EI CAS CSCD, 2021, Issue 5, pp. 96-106 (11 pages)
Current works on environmental perception for connected autonomous electrified vehicles (CAEVs) mainly focus on the object detection task in good weather and illumination conditions; they often perform poorly in adverse scenarios and have a vague scene parsing ability. This paper aims to develop an end-to-end sharpening mixture of experts (SMoE) fusion framework to improve the robustness and accuracy of the perception systems for CAEVs in complex illumination and weather conditions. Three original contributions make our work distinctive from the existing relevant literature. The Complex KITTI dataset is introduced, which consists of 7481 pairs of modified KITTI RGB images and the generated LiDAR dense depth maps; this dataset is finely annotated at the instance level with the proposed semi-automatic annotation method. The SMoE fusion approach is devised to adaptively learn the robust kernels from complementary modalities. Comprehensive comparative experiments are implemented, and the results show that the proposed SMoE framework yields significant improvements over the other fusion techniques in adverse environmental conditions. This research proposes an SMoE fusion framework to improve the scene parsing ability of the perception systems for CAEVs in adverse conditions.
Keywords: connected autonomous electrified vehicles, multi-modal fusion, semi-automatic annotation, sharpening mixture of experts, comparative experiments
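The core of a mixture-of-experts fusion layer can be sketched in a few lines of PyTorch. The module below is a generic (non-sharpening) variant: a 1x1-convolution gate predicts per-pixel weights that blend RGB and depth feature maps; channel counts and shapes are illustrative, not the authors' architecture.

```python
import torch
import torch.nn as nn

class MoEFusion(nn.Module):
    """Generic MoE fusion: a gating head predicts per-pixel expert weights
    that blend RGB and LiDAR-depth feature maps (not the exact SMoE layer)."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.gate = nn.Conv2d(2 * channels, 2, kernel_size=1)

    def forward(self, rgb_feat, depth_feat):
        g = torch.softmax(self.gate(torch.cat([rgb_feat, depth_feat], 1)), dim=1)
        # g[:, :1] weights the RGB expert, g[:, 1:] the depth expert.
        return g[:, :1] * rgb_feat + g[:, 1:] * depth_feat

rgb = torch.randn(1, 64, 80, 120)
depth = torch.randn(1, 64, 80, 120)
fused = MoEFusion()(rgb, depth)
```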
11. Dynamic Audio-Visual Biometric Fusion for Person Recognition (cited: 1)
Authors: Najlaa Hindi Alsaedi, Emad Sami Jaha. Computers, Materials & Continua, SCIE EI, 2022, Issue 4, pp. 1283-1311 (29 pages)
Biometric recognition refers to the process of recognizing a person's identity using physiological or behavioral modalities, such as face, voice, fingerprint, gait, etc. Such biometric modalities are mostly used in recognition tasks separately, as in unimodal systems, or jointly with two or more, as in multimodal systems. However, multimodal systems can usually enhance the recognition performance over unimodal systems by integrating the biometric data of multiple modalities at different fusion levels. Despite this enhancement, in real-life applications some factors degrade multimodal systems' performance, such as occlusion, face poses, and noise in voice data. In this paper, we propose two algorithms that effectively apply dynamic fusion at the feature level based on the data quality of multimodal biometrics. The proposed algorithms attempt to minimize the negative influence of confusing and low-quality features by either exclusion or weight reduction to achieve better recognition performance. The proposed dynamic fusion was achieved using face and voice biometrics, where face features were extracted using principal component analysis (PCA) and Gabor filters separately, whilst voice features were extracted using Mel-Frequency Cepstral Coefficients (MFCCs). Here, the facial data quality assessment of face images is mainly based on the existence of occlusion, whereas the assessment of voice data quality is substantially based on the calculation of the signal-to-noise ratio (SNR) as per the existence of noise. To evaluate the performance of the proposed algorithms, several experiments were conducted using two combinations of three different databases: the AR database and the extended Yale Face Database B for face images, in addition to the VOiCES database for voice data. The obtained results show that both proposed dynamic fusion algorithms attain improved performance and offer more advantages in identification and verification over not only the standard unimodal algorithms but also the multimodal algorithms using standard fusion methods.
Keywords: biometrics, dynamic fusion, feature fusion, identification, multimodal biometrics, occluded face recognition, quality-based recognition, verification, voice recognition
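A minimal sketch of the quality-driven weighting idea follows: an SNR estimate for voice and an occlusion flag for face drive per-modality weights before feature-level fusion. The thresholds and weight values are assumptions for illustration, not the paper's calibrated choices.

```python
import numpy as np

def snr_db(signal, noise):
    """Signal-to-noise ratio in dB from power estimates."""
    return 10 * np.log10(np.mean(signal ** 2) / np.mean(noise ** 2))

def quality_weights(face_occluded: bool, voice_snr_db: float):
    """Down-weight (or nearly exclude) a modality based on assessed quality."""
    w_face = 0.2 if face_occluded else 1.0         # reduce weight under occlusion
    w_voice = min(max(voice_snr_db / 30.0, 0.0), 1.0)  # scale by SNR, cap at 1
    total = w_face + w_voice
    return w_face / total, w_voice / total

# Synthetic voice window and noise estimate for the SNR assessment.
voice = np.sin(np.linspace(0, 100, 8000))          # clean speech stand-in
noise = 0.2 * np.random.randn(8000)
w_f, w_v = quality_weights(face_occluded=True, voice_snr_db=snr_db(voice, noise))

# Weighted feature-level fusion of face (e.g., PCA) and voice (MFCC) vectors.
fused = np.concatenate([w_f * np.random.rand(50), w_v * np.random.rand(13)])
```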
12. Method of Multi-Mode Sensor Data Fusion with an Adaptive Deep Coupling Convolutional Auto-Encoder
Authors: Xiaoxiong Feng, Jianhua Liu. Journal of Sensor Technology, 2023, Issue 4, pp. 69-85 (17 pages)
To address the difficulties in fusing multi-mode sensor data for complex industrial machinery, an adaptive deep coupling convolutional auto-encoder (ADCCAE) fusion method was proposed. First, the multi-mode features extracted synchronously by the CCAE were stacked and fed to the multi-channel convolution layers for fusion. Then, the fused data was passed to the fully connected layers for compression and fed to the Softmax module for classification. Finally, the coupling loss function coefficients and the network parameters were optimized through an adaptive approach using the gray wolf optimization (GWO) algorithm. Experimental comparisons showed that the proposed ADCCAE fusion model was superior to existing models for multi-mode data fusion.
Keywords: multi-mode data fusion, coupling convolutional auto-encoder, adaptive optimization, deep learning
13. Neural Network Based Normalized Fusion Approaches for Optimized Multimodal Biometric Authentication Algorithm (cited: 2)
Authors: E. Sujatha, A. Chilambuchelvan. Circuits and Systems, 2016, Issue 8, pp. 1199-1206 (8 pages)
A multimodal biometric system is applied to recognize individuals for authentication using neural networks. In this paper, a multimodal biometric algorithm is designed by integrating iris, finger vein, palm print, and face biometric traits. A normalized score-level fusion approach is applied and optimized, then encoded for the matching decision. It is a multilevel wavelet, phase-based fusion algorithm. This robust multimodal biometric algorithm increases the security level and accuracy, reduces memory size and the equal error rate, and eliminates unimodal biometric algorithm vulnerabilities.
Keywords: multimodal biometrics, score-level fusion approach, neural network, optimization
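Normalized score-level fusion reduces to rescaling each matcher's scores to a common range and combining them with weights, as in the sketch below; the four score vectors and the weights are synthetic stand-ins, not values from the paper.

```python
import numpy as np

def min_max_normalize(scores):
    """Map raw matcher scores to [0, 1] so modalities are comparable."""
    s = np.asarray(scores, dtype=float)
    return (s - s.min()) / (s.max() - s.min())

# Matcher scores from four traits over the same 10-candidate list
# (values and weights are illustrative only).
iris, vein, palm, face = (np.random.rand(10) for _ in range(4))
weights = np.array([0.3, 0.25, 0.25, 0.2])

normalized = np.vstack([min_max_normalize(s) for s in (iris, vein, palm, face)])
fused = weights @ normalized                 # weighted-sum score-level fusion
decision = fused.argmax()                    # best-matching identity
```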
14. Fake News Detection Based on Cross-Modal Message Aggregation and Gated Fusion Network
Authors: Fangfang Shan, Mengyao Liu, Menghan Zhang, Zhenyu Wang. Computers, Materials & Continua, SCIE EI, 2024, Issue 7, pp. 1521-1542 (22 pages)
Social media has become increasingly significant in modern society, but it has also turned into a breeding ground for the propagation of misleading information, potentially causing a detrimental impact on public opinion and daily life. Compared to pure text content, multimodal content significantly increases the visibility and shareability of posts. This has made the search for efficient modality representations and cross-modal information interaction methods a key focus in the field of multimodal fake news detection. To effectively address the critical challenge of accurately detecting fake news on social media, this paper proposes a fake news detection model based on cross-modal message aggregation and a gated fusion network (MAGF). MAGF first uses BERT to extract cumulative textual feature representations and word-level features, applies Faster Region-based Convolutional Neural Network (Faster R-CNN) to obtain image objects, and leverages ResNet-50 and Visual Geometry Group-19 (VGG-19) to obtain image region features and global features. The image region features and word-level text features are then projected into a low-dimensional space to calculate a text-image affinity matrix for cross-modal message aggregation. The gated fusion network combines text and image region features to obtain adaptively aggregated features. The interaction matrix is derived through an attention mechanism and further integrated with global image features using a co-attention mechanism to produce multimodal representations. Finally, these fused features are fed into a classifier for news categorization. Experiments were conducted on two public datasets, Twitter and Weibo. Results show that the proposed model achieves accuracy rates of 91.8% and 88.7% on the two datasets, respectively, significantly outperforming traditional unimodal and existing multimodal models.
Keywords: fake news detection, cross-modal message aggregation, gated fusion network, co-attention mechanism, multi-modal representation
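A gated fusion unit of the kind named above can be sketched compactly in PyTorch: a sigmoid gate decides, per feature dimension, how much to trust the text versus the image representation. Feature dimensions are illustrative, and this is a generic gate rather than MAGF's exact layer.

```python
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    """Sketch of a gated fusion unit: a learned sigmoid gate blends text
    and image features dimension by dimension."""
    def __init__(self, dim: int = 256):
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, text_feat, img_feat):
        z = torch.sigmoid(self.gate(torch.cat([text_feat, img_feat], dim=-1)))
        return z * text_feat + (1 - z) * img_feat

text = torch.randn(8, 256)                   # e.g., BERT sentence features
image = torch.randn(8, 256)                  # e.g., pooled region features
fused = GatedFusion()(text, image)
```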
15. Fake News Detection Based on Text-Modal Dominance and Fusing Multiple Multi-Model Clues
Authors: Lifang Fu, Huanxin Peng, Changjin Ma, Yuhan Liu. Computers, Materials & Continua, SCIE EI, 2024, Issue 3, pp. 4399-4416 (18 pages)
In recent years, efficiently and accurately identifying multi-modal fake news has become more challenging. First, multi-modal data provides more evidence, but not all of it is equally important. Second, social structure information has proven effective in fake news detection, and how to combine it while reducing noise is critical. Unfortunately, existing approaches fail to handle these problems. This paper proposes a multi-modal fake news detection framework based on Text-modal Dominance and fusing Multiple Multi-modal Clues (TD-MMC), which utilizes three valuable multi-modal clues: text-modal importance, text-image complementarity, and text-image inconsistency. TD-MMC is dominated by textual content and assisted by image information, while using social network information to enhance text representation. To reduce interference from irrelevant social structure information, we use a unidirectional cross-modal attention mechanism to selectively learn the social structure's features. A cross-modal attention mechanism is adopted to obtain text-image cross-modal features while retaining textual features to reduce the loss of important information. In addition, TD-MMC employs a new multi-modal loss to improve the model's generalization ability. Extensive experiments have been conducted on two public real-world English and Chinese datasets, and the results show that our proposed model outperforms the state-of-the-art methods on classification evaluation metrics.
Keywords: fake news detection, cross-modal attention mechanism, multi-modal fusion, social network, transfer learning
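The unidirectional cross-modal attention idea can be approximated with a standard attention layer in which text provides the queries and the auxiliary modality provides keys and values, so information flows only into the text representation. The sketch below uses PyTorch's built-in multi-head attention with made-up dimensions; it is a generic illustration, not TD-MMC's exact module.

```python
import torch
import torch.nn as nn

# Unidirectional cross-modal attention sketch: text queries attend over
# social-structure features, so text stays dominant and only pulls in the
# auxiliary evidence it finds relevant.
attn = nn.MultiheadAttention(embed_dim=256, num_heads=4, batch_first=True)

text_tokens = torch.randn(8, 32, 256)        # query: textual content
social_feats = torch.randn(8, 16, 256)       # key/value: social structure
enhanced_text, _ = attn(text_tokens, social_feats, social_feats)
```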
16. Enhanced Biometric Score Fusion Scheme Based on the AdaBoost Algorithm
Authors: Wei-Yang Lin, Chih-Yang Lin, Chuan-Jheng Yang. Journal of Electronic Science and Technology, CAS CSCD, 2017, Issue 2, pp. 187-193 (7 pages)
Information fusion in biometric systems, whether multimodal or intramodal, usually provides an improvement in recognition performance. This paper presents an improved score-level fusion scheme called boosted score fusion. The proposed framework is a two-stage design where an existing fusion algorithm is adopted at the first stage. At the second stage, the weights obtained by the AdaBoost algorithm are utilized to boost the performance of the previously fused results. The experimental results demonstrate that the performance of several score-level fusion methods can be improved by using the presented method.
Keywords: AdaBoost, biometric authentication, face recognition, score-level fusion
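A minimal second-stage sketch with scikit-learn is shown below: per-matcher scores act as features, and AdaBoost learns to separate genuine from impostor comparisons, yielding a boosted fused score. The score distributions are synthetic, and the first-stage fusion is assumed to have already happened.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

# Synthetic per-matcher scores for genuine and impostor comparisons
# (3 matchers; distributions are made up for illustration).
rng = np.random.default_rng(0)
genuine = rng.normal(0.7, 0.1, size=(200, 3))
impostor = rng.normal(0.4, 0.1, size=(200, 3))
X = np.vstack([genuine, impostor])
y = np.array([1] * 200 + [0] * 200)

# Stage 2: AdaBoost learns weights over the score features.
booster = AdaBoostClassifier(n_estimators=50).fit(X, y)
boosted_score = booster.predict_proba(X[:5])[:, 1]   # refined fusion scores
```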
17. Test method of laser paint removal based on multi-modal feature fusion
Authors: HUANG Hai-peng, HAO Ben-tian, YE De-jun, GAO Hao, LI Liang. Journal of Central South University, SCIE EI CAS CSCD, 2022, Issue 10, pp. 3385-3398 (14 pages)
Laser cleaning is a highly nonlinear physical process. To solve the poor performance of single-modal (e.g., acoustic or vision) detection and the low utilization of cross-modal information, a multi-modal feature fusion network model was constructed based on a laser paint removal experiment. The alignment of heterogeneous data under different modals was solved by combining piecewise aggregate approximation and the gramian angular field. Moreover, the attention mechanism was introduced to optimize the dual-path network and dense connection network, enabling the sampling characteristics to be extracted and integrated. Consequently, the multi-modal discriminant detection of laser paint removal was realized. According to the experimental results, the verification accuracy of the constructed model on the experimental dataset was 99.17%, which is 5.77% higher than the optimal single-modal detection results for laser paint removal. The feature extraction network was optimized by the attention mechanism, and the model accuracy was increased by 3.3%. The results verify the improved classification performance of the constructed multi-modal feature fusion model in detecting laser paint removal, the effective integration of acoustic data and visual image data, and the accurate detection of laser paint removal.
Keywords: laser cleaning, multi-modal fusion, image processing, deep learning
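Piecewise aggregate approximation (PAA) followed by a Gramian angular field (GAF) turns a 1D acoustic signal into an image-like matrix that a CNN can fuse with camera data. The sketch below implements the summation-field variant on a synthetic signal; window and segment sizes are illustrative assumptions.

```python
import numpy as np

def paa(series, segments):
    """Piecewise aggregate approximation: mean over equal-length segments."""
    return np.asarray(series).reshape(segments, -1).mean(axis=1)

def gaf(series):
    """Gramian angular (summation) field of a signal rescaled to [-1, 1]."""
    s = np.asarray(series, dtype=float)
    s = 2 * (s - s.min()) / (s.max() - s.min()) - 1
    phi = np.arccos(s)                           # angular encoding
    return np.cos(phi[:, None] + phi[None, :])   # image-like 2D matrix

# A 1024-sample acoustic window becomes a 64x64 pseudo-image that can be
# fed to the same convolutional network as the visual stream.
signal = np.sin(np.linspace(0, 20 * np.pi, 1024)) + 0.1 * np.random.randn(1024)
gaf_image = gaf(paa(signal, 64))
```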
18. Adaptive multi-modal feature fusion for far and hard object detection
Authors: LI Yang, GE Hongwei. Journal of Measurement Science and Instrumentation, CAS CSCD, 2021, Issue 2, pp. 232-241 (10 pages)
In order to solve the difficult detection of far and hard objects due to the sparseness and insufficient semantic information of LiDAR point clouds, a 3D object detection network with multi-modal data adaptive fusion is proposed, which makes use of multi-neighborhood voxel information and image information. Firstly, we design an improved ResNet that maintains the structure information of far and hard objects in low-resolution feature maps, which is more suitable for the detection task. Meanwhile, the semantics of each image feature map are enhanced by semantic information from all subsequent feature maps. Secondly, we extract multi-neighborhood context information with different receptive field sizes to make up for the sparseness of the point cloud, which improves the ability of voxel features to represent the spatial structure and semantic information of objects. Finally, we propose a multi-modal feature adaptive fusion strategy that uses learnable weights to express the contribution of different modal features to the detection task, and voxel attention further enhances the fused feature expression of effective target objects. The experimental results on the KITTI benchmark show that this method outperforms VoxelNet by remarkable margins, i.e., increasing the AP by 8.78% and 5.49% on the medium and hard difficulty levels. Meanwhile, our method achieves greater detection performance compared with many mainstream multi-modal methods, i.e., outperforming the AP of MVX-Net by 1% on the medium and hard difficulty levels.
Keywords: 3D object detection, adaptive fusion, multi-modal data fusion, attention mechanism, multi-neighborhood features
19. Adaptive cross-fusion learning for multi-modal gesture recognition
Authors: Benjia ZHOU, Jun WAN, Yanyan LIANG, Guodong GUO. Virtual Reality & Intelligent Hardware, 2021, Issue 3, pp. 235-247 (13 pages)
Background: Gesture recognition has attracted significant attention because of its wide range of potential applications. Although multi-modal gesture recognition has made significant progress in recent years, a popular method is still simply to fuse prediction scores at the end of each branch, which often ignores complementary features among different modalities in the early stage and does not fuse the complementary features into a more discriminative feature. Methods: This paper proposes an Adaptive Cross-modal Weighting (ACmW) scheme to exploit complementary features from RGB-D data. The scheme learns relations among different modalities by combining the features of different data streams. The proposed ACmW module contains two key functions: (1) fusing complementary features from multiple streams through an adaptive one-dimensional convolution; and (2) modeling the correlation of multi-stream complementary features in the time dimension. Through the effective combination of these two functional modules, the proposed ACmW can automatically analyze the relationship between the complementary features from different streams, and can fuse them in the spatial and temporal dimensions. Results: Extensive experiments validate the effectiveness of the proposed method, and show that our method outperforms state-of-the-art methods on IsoGD and NVGesture.
Keywords: gesture recognition, multi-modal fusion, RGB-D
20. Multi-modality hierarchical fusion network for lumbar spine segmentation with magnetic resonance images
Authors: Han Yan, Guangtao Zhang, Wei Cui, Zhuliang Yu. Control Theory and Technology, EI CSCD, 2024, Issue 4, pp. 612-622 (11 pages)
For the analysis of spinal and disc diseases, automated tissue segmentation of the lumbar spine is vital. Due to the continuous and concentrated location of the target, the abundance of edge features, and individual differences, conventional automatic segmentation methods perform poorly. Since the success of deep learning in the segmentation of medical images has been shown in the past few years, it has been applied to this task in a number of ways. However, the multi-scale and multi-modal features of lumbar tissues are rarely explored by deep learning methodologies. Because of inadequacies in medical image availability, it is crucial to effectively fuse various modes of data collection for model training to alleviate the problem of insufficient samples. In this paper, we propose a novel multi-modality hierarchical fusion network (MHFN) for improving lumbar spine segmentation by learning robust feature representations from multi-modality magnetic resonance images. An adaptive group fusion module (AGFM) is introduced to fuse features from various modes and extract potentially valuable cross-modality features. Furthermore, to combine cross-modality features from low to high levels, we design a hierarchical fusion structure based on AGFM. Based on experimental results on multi-modality MR images of the lumbar spine, AGFM is more effective than the other feature fusion methods compared. To further demonstrate segmentation accuracy, we compare our network with baseline fusion structures. Compared to the baseline fusion structures (input-level: 76.27%, layer-level: 78.10%, decision-level: 79.14%), our network was able to segment fractured vertebrae more accurately (85.05%).
Keywords: lumbar spine segmentation, deep learning, multi-modality fusion, feature fusion