Journal Articles
692 articles found
1. Multimodal fusion recognition for digital twin
Authors: Tianzhe Zhou, Xuguang Zhang, Bing Kang, Mingkai Chen. Digital Communications and Networks (SCIE, CSCD), 2024, No. 2, pp. 337-346 (10 pages)
The digital twin is a concept that transcends reality: it provides reverse feedback from the real physical space to the virtual digital space. People hold great prospects for this emerging technology. To upgrade the digital twin industrial chain, it is urgent to introduce more modalities, such as vision, haptics, hearing and smell, into the virtual digital space, which helps physical entities and virtual objects form a closer connection. Perceptual understanding and object recognition have therefore become pressing topics in the digital twin. Existing surface material classification schemes often achieve recognition through machine learning or deep learning in a single modality, ignoring the complementarity between multiple modalities. To overcome this limitation, we propose a multimodal fusion network that combines two modalities, visual and haptic, for surface material recognition. On the one hand, the network makes full use of the potential correlations between modalities to mine modal semantics deeply and complete the data mapping. On the other hand, the network is extensible and can serve as a universal architecture that accommodates more modalities. Experiments show that the constructed multimodal fusion network achieves 99.42% classification accuracy while reducing complexity.
Keywords: digital twin, multimodal fusion, object recognition, deep learning, transfer learning
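The abstract leaves the network unspecified, but the general pattern (one encoder per modality, concatenation, and a shared head to which further branches can be attached) can be sketched as below. All layer sizes, input shapes and the concat-fusion strategy are illustrative assumptions, not the paper's design:

```python
import torch
import torch.nn as nn

class TwoModalFusionNet(nn.Module):
    """Toy visual + haptic fusion classifier (concat fusion)."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        # Visual branch: a small CNN over RGB surface patches.
        self.visual = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(16, 64),
        )
        # Haptic branch: an MLP over a 1-D acceleration signature.
        self.haptic = nn.Sequential(
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, 64),
        )
        # Fusion head over the concatenated embeddings; adding a third
        # modality would mean concatenating one more branch output here.
        self.head = nn.Linear(64 + 64, num_classes)

    def forward(self, img, sig):
        return self.head(torch.cat([self.visual(img), self.haptic(sig)], dim=1))

net = TwoModalFusionNet()
logits = net(torch.randn(4, 3, 32, 32), torch.randn(4, 128))
print(logits.shape)  # torch.Size([4, 10])
```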
2. FusionNN: A Semantic Feature Fusion Model Based on Multimodal for Web Anomaly Detection
Authors: Li Wang, Mingshan Xia, Hao Hu, Jianfang Li, Fengyao Hou, Gang Chen. Computers, Materials & Continua (SCIE, EI), 2024, No. 5, pp. 2991-3006 (16 pages)
With the rapid development of mobile communication and the Internet, previous web anomaly detection and identification models were built on security experts' empirical knowledge and attack features. Although this approach can achieve high detection performance, it requires enormous human labor and resources to maintain the feature library. In contrast, semantic feature engineering can dynamically discover new semantic features and optimize feature selection by automatically analyzing the semantic information contained in the data itself, thus reducing dependence on prior knowledge. However, current semantic features still suffer from singular semantic expression, as they are extracted from a single semantic mode such as word segmentation, character segmentation, or arbitrary semantic feature extraction. This paper extracts features of web requests at dual semantic granularity and proposes a semantic feature fusion method to solve the above problems. The method first preprocesses web requests, then extracts word-level and character-level semantic features of URLs via convolutional neural networks (CNNs). Three loss functions are constructed to reduce the losses between features, labels and categories. Experiments on the HTTP CSIC 2010, Malicious URLs and HttpParams datasets verify the proposed method. Results show that, compared with machine learning, deep learning methods and the BERT model, the proposed method has better detection performance, achieving a best detection rate of 99.16% on the HttpParams dataset.
Keywords: feature fusion, web anomaly detection, multimodal, convolutional neural network (CNN), semantic feature extraction
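A minimal sketch of the dual-granularity idea: separate word-level and character-level CNN encoders over a tokenized URL, fused by concatenation before classification. Vocabulary sizes, embedding widths and kernel sizes are assumptions, and the paper's three-loss training scheme is omitted:

```python
import torch
import torch.nn as nn

class DualGranularityCNN(nn.Module):
    """Word-level and character-level 1-D CNNs over a URL, concat-fused."""
    def __init__(self, char_vocab=100, word_vocab=5000, emb=32, n_classes=2):
        super().__init__()
        self.char_emb = nn.Embedding(char_vocab, emb)
        self.word_emb = nn.Embedding(word_vocab, emb)
        self.char_cnn = nn.Sequential(nn.Conv1d(emb, 64, 3, padding=1),
                                      nn.ReLU(), nn.AdaptiveMaxPool1d(1))
        self.word_cnn = nn.Sequential(nn.Conv1d(emb, 64, 3, padding=1),
                                      nn.ReLU(), nn.AdaptiveMaxPool1d(1))
        self.fc = nn.Linear(128, n_classes)  # benign vs. anomalous

    def forward(self, char_ids, word_ids):
        # Embed, move channels first for Conv1d, pool to one vector each.
        c = self.char_cnn(self.char_emb(char_ids).transpose(1, 2)).squeeze(-1)
        w = self.word_cnn(self.word_emb(word_ids).transpose(1, 2)).squeeze(-1)
        return self.fc(torch.cat([c, w], dim=1))

model = DualGranularityCNN()
out = model(torch.randint(0, 100, (8, 200)),   # 200 characters per URL
            torch.randint(0, 5000, (8, 30)))   # 30 word tokens per URL
print(out.shape)  # torch.Size([8, 2])
```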
3. A deep multimodal fusion and multitasking trajectory prediction model for typhoon trajectory prediction to reduce flight scheduling cancellation
Authors: TANG Jun, QIN Wanting, PAN Qingtao, LAO Songyang. Journal of Systems Engineering and Electronics (SCIE, CSCD), 2024, No. 3, pp. 666-678 (13 pages)
Natural events have a significant impact on overall flight activity, and the aviation industry plays a vital role in helping society cope with these events. As one of the most impactful weather phenomena, when typhoon season arrives and persists, airlines operating in threatened areas and passengers with travel plans during this period pay close attention to the development of tropical storms. This paper proposes a deep multimodal fusion and multitasking trajectory prediction model that can improve the reliability of typhoon trajectory prediction and reduce the number of flight cancellations. The deep multimodal fusion module is formed by deeply fusing the features output by multiple submodal fusion modules, and the multitask generation module predicts longitude and latitude simultaneously as two related tasks. With more dependable data accuracy, problems can be analysed rapidly and more efficiently, enabling better decision-making with a proactive rather than reactive posture. When multiple modalities coexist, features can be extracted from them simultaneously so that they supplement each other's information. An actual case study of typhoon Lekima, which swept China in 2019, demonstrates that the algorithm can effectively reduce the number of unnecessary flight cancellations compared to existing flight scheduling, and can assist a new generation of flight scheduling systems under extreme weather.
Keywords: flight scheduling optimization, deep multimodal fusion, multitasking trajectory prediction, typhoon weather, flight cancellation, prediction reliability
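The multitask generation module treats longitude and latitude as two related regression tasks over a shared representation. A toy version of such a head (all dimensions assumed) could look like this:

```python
import torch
import torch.nn as nn

class MultitaskTrajectoryHead(nn.Module):
    """Shared trunk with two regression heads, one per coordinate."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU())
        self.lon_head = nn.Linear(128, 1)
        self.lat_head = nn.Linear(128, 1)

    def forward(self, fused_feats):
        h = self.trunk(fused_feats)  # representation shared by both tasks
        return self.lon_head(h), self.lat_head(h)

head = MultitaskTrajectoryHead()
lon, lat = head(torch.randn(4, 256))
# Joint loss: gradients from each task flow through the shared trunk,
# so the two related tasks regularize each other.
loss = nn.functional.mse_loss(lon, torch.zeros_like(lon)) \
     + nn.functional.mse_loss(lat, torch.zeros_like(lat))
loss.backward()
```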
4. Multimodal Medical Image Fusion Based on Parameter Adaptive PCNN and Latent Low-rank Representation (Cited by 1)
Authors: WANG Wenyan, ZHOU Xianchun, YANG Liangjian. Instrumentation, 2023, No. 1, pp. 45-58 (14 pages)
Medical image fusion has been developed as an efficient assistive technology in various clinical applications such as medical diagnosis and treatment planning. Aiming at the problem that traditional image fusion methods insufficiently protect image contour and detail information, a new multimodal medical image fusion method is proposed. This method first uses the non-subsampled shearlet transform to decompose the source image into high- and low-frequency sub-band coefficients, then uses the latent low-rank representation algorithm to fuse the low-frequency sub-band coefficients and applies an improved PAPCNN algorithm to fuse the high-frequency sub-band coefficients. Finally, on top of automatic parameter setting, an optimized configuration of the time decay factor αe is applied. The experimental results show that the proposed method solves the problems of difficult parameter setting and insufficient detail protection in traditional PCNN-based fusion, while achieving clear improvements in visual quality and objective evaluation indicators.
Keywords: image fusion, non-subsampled shearlet transform, parameter-adaptive PCNN, latent low-rank representation
5. Multimodal Social Media Fake News Detection Based on Similarity Inference and Adversarial Networks (Cited by 1)
Authors: Fangfang Shan, Huifang Sun, Mengyi Wang. Computers, Materials & Continua (SCIE, EI), 2024, No. 4, pp. 581-605 (25 pages)
As social networks become increasingly complex, contemporary fake news often includes textual descriptions of events accompanied by corresponding images or videos. Fake news in multiple modalities is more likely to create a misleading perception among users. While early research primarily focused on text-based features for fake news detection mechanisms, there has been relatively limited exploration of learning shared representations in multimodal (text and visual) contexts. To address these limitations, this paper introduces a multimodal model for detecting fake news which relies on similarity reasoning and adversarial networks. The model employs Bidirectional Encoder Representations from Transformers (BERT) and a Text Convolutional Neural Network (Text-CNN) for extracting textual features, while utilizing the pre-trained Visual Geometry Group 19-layer network (VGG-19) to extract visual features. Subsequently, the model establishes similarity representations between the textual features extracted by Text-CNN and the visual features through similarity learning and reasoning. Finally, these features are fused to enhance the accuracy of fake news detection, and adversarial networks are employed to investigate the relationship between fake news and events. This paper validates the proposed model using publicly available multimodal datasets from Weibo and Twitter. Experimental results demonstrate that the proposed approach achieves superior performance on Twitter, with an accuracy of 86%, surpassing traditional unimodal models and existing multimodal models, while on the Weibo dataset the model surpasses the benchmark models across multiple metrics. The application of similarity reasoning and adversarial networks significantly enhances detection effectiveness. However, the current research is limited to the fusion of text and image modalities; future work should integrate features from additional modalities to represent the multifaceted information of fake news comprehensively.
Keywords: fake news detection, attention mechanism, image-text similarity, multimodal feature fusion
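One way to realize the similarity-representation step is to project the text features (768-d, as from BERT) and visual features (4096-d, as from a VGG-19 fully connected layer) into a shared space and feed their cosine similarity to the classifier alongside the fused features. The sketch below is a hypothetical reading of that step, not the paper's exact model; the adversarial event discriminator is omitted:

```python
import torch
import torch.nn as nn

class SimilarityFusionDetector(nn.Module):
    """Project text and image features into a shared space, append their
    cosine similarity to the concatenated features, then classify."""
    def __init__(self, txt_dim=768, img_dim=4096, shared=256):
        super().__init__()
        self.txt_proj = nn.Linear(txt_dim, shared)
        self.img_proj = nn.Linear(img_dim, shared)
        self.cls = nn.Linear(2 * shared + 1, 2)  # real vs. fake

    def forward(self, txt_feat, img_feat):
        t, v = self.txt_proj(txt_feat), self.img_proj(img_feat)
        # Cross-modal agreement as an explicit scalar feature.
        sim = nn.functional.cosine_similarity(t, v, dim=1, eps=1e-8)
        return self.cls(torch.cat([t, v, sim.unsqueeze(1)], dim=1))

det = SimilarityFusionDetector()
print(det(torch.randn(4, 768), torch.randn(4, 4096)).shape)  # torch.Size([4, 2])
```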
6. Conditional selection with CNN augmented transformer for multimodal affective analysis
Authors: Jianwen Wang, Shiping Wang, Shunxin Xiao, Renjie Lin, Mianxiong Dong, Wenzhong Guo. CAAI Transactions on Intelligence Technology (SCIE, EI), 2024, No. 4, pp. 917-931 (15 pages)
The attention mechanism has been a successful method for multimodal affective analysis in recent years. Despite the advances, several significant challenges remain in fusing language with its nonverbal context information. One is generating sparse attention coefficients associated with the acoustic and visual modalities, which helps locate critical emotional semantics. The other is fusing complementary cross-modal representations to construct optimal salient feature combinations of multiple modalities. A Conditional Transformer Fusion Network is proposed to handle these problems. Firstly, the authors equip the transformer module with CNN layers to enhance the detection of subtle signal patterns in nonverbal sequences. Secondly, sentiment words are utilised as context conditions to guide the computation of cross-modal attention. As a result, the located nonverbal features are not only salient but also directly complementary to the sentiment words. Experimental results show that the authors' method achieves state-of-the-art performance on several multimodal affective analysis datasets.
Keywords: affective computing, data fusion, information fusion, multimodal approaches
7. Interactive System for Video Summarization Based on Multimodal Fusion (Cited by 1)
Authors: Zheng Li, Xiaobing Du, Cuixia Ma, Yanfeng Li, Hongan Wang. Journal of Beijing Institute of Technology (EI, CAS), 2019, No. 1, pp. 27-34 (8 pages)
Biography videos based on the life performances of prominent figures in history aim to describe the lives of great men. In this paper, a novel interactive video summarization for biography video based on multimodal fusion is proposed, which visualizes features specific to biography video and supports interaction with video content by taking advantage of multimodality. In general, a movie's story progresses through the dialogue of its characters, and the subtitles, produced from that dialogue, contain all the information related to the movie. JGibbsLDA is applied to extract keywords from subtitles, because a biography video consists of different aspects depicting the character's whole life. To fuse keywords and key-frames, affinity propagation is adopted to calculate the similarity between each key-frame cluster and the keywords. Through the method above, a video summarization based on multimodal fusion is presented which describes video content more completely. To reduce the time spent searching for video content of interest and to reveal the relationships between main characters, a map-style visualization is adopted to present video content and interact with the summarization. An experiment is conducted to evaluate the summarization, and the results demonstrate that this system facilitates the exploration of video content while improving interaction and finding events of interest efficiently.
Keywords: video visualization, interaction, multimodal fusion, video summarization
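A rough illustration of the keyword/key-frame matching step: cluster key-frame features with affinity propagation, then score each cluster's exemplar against keyword embeddings by cosine similarity. The shared 64-d feature space and the random data are stand-ins for the paper's actual features:

```python
import numpy as np
from sklearn.cluster import AffinityPropagation

rng = np.random.default_rng(0)

# Toy stand-ins: 50 key-frame feature vectors and 5 keyword embeddings,
# assumed to live in the same 64-d space.
frames = rng.normal(size=(50, 64))
keywords = rng.normal(size=(5, 64))

# Cluster key-frames; each cluster is represented by its exemplar frame.
ap = AffinityPropagation(damping=0.9, random_state=0).fit(frames)
exemplars = frames[ap.cluster_centers_indices_]

def cosine(a, b):
    """Pairwise cosine similarity between rows of a and rows of b."""
    return a @ b.T / (np.linalg.norm(a, axis=1, keepdims=True)
                      * np.linalg.norm(b, axis=1))

# Score each (cluster, keyword) pair, then attach each cluster to its
# best-matching keyword for the fused summary.
sim = cosine(exemplars, keywords)          # (n_clusters, n_keywords)
for c, k in enumerate(sim.argmax(axis=1)):
    print(f"key-frame cluster {c} -> keyword {k}")
```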
8. Multimodal Medical Image Registration and Fusion for Quality Enhancement (Cited by 2)
Authors: Muhammad Adeel Azam, Khan Bahadar Khan, Muhammad Ahmad, Manuel Mazzara. Computers, Materials & Continua (SCIE, EI), 2021, No. 7, pp. 821-840 (20 pages)
For the last two decades, physicians and clinical experts have used a single imaging modality to identify normal and abnormal structures of the human body. However, medical experts are often unable to accurately analyze and examine the information from a single imaging modality due to its limited information. To overcome this problem, a multimodal approach is adopted to increase the qualitative and quantitative medical information, which helps doctors diagnose diseases in their early stages. In the proposed method, a Multi-resolution Rigid Registration (MRR) technique is used for multimodal image registration, while the Discrete Wavelet Transform (DWT) along with Principal Component Averaging (PCAv) is utilized for image fusion. The proposed MRR method provides more accurate results than Single Rigid Registration (SRR), while the proposed DWT-PCAv fusion process adds more constructive information at less computational time. The proposed method is tested on CT and MRI brain imaging modalities of the HARVARD dataset. The fusion results are compared with existing fusion techniques. Quality assessment metrics such as Mutual Information (MI), Normalized Cross-Correlation (NCC) and Feature Mutual Information (FMI) are computed for statistical comparison. The proposed methodology provides more accurate results, better image quality and valuable information for medical diagnoses.
Keywords: multimodal, registration, fusion, multi-resolution rigid registration, discrete wavelet transform, principal component averaging
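The abstract does not spell out the DWT-PCAv rule. One plausible reading, sketched below with pywt, weights the two approximation bands by the leading eigenvector of their joint covariance (a PCA-derived average) and keeps the larger-magnitude detail coefficients; the wavelet choice and the max-abs detail rule are assumptions:

```python
import numpy as np
import pywt

def dwt_pca_fuse(img_a, img_b, wavelet="db2"):
    """Single-level DWT fusion with PCA-derived weights for the
    approximation band and max-abs selection for detail bands."""
    (cA_a, det_a), (cA_b, det_b) = (pywt.dwt2(img_a, wavelet),
                                    pywt.dwt2(img_b, wavelet))
    # PCA on the two approximation bands: the leading eigenvector of
    # their 2x2 covariance yields the relative averaging weights.
    data = np.stack([cA_a.ravel(), cA_b.ravel()])
    _, eigvecs = np.linalg.eigh(np.cov(data))
    w = np.abs(eigvecs[:, -1])
    w = w / w.sum()
    cA_f = w[0] * cA_a + w[1] * cA_b
    # Detail bands: keep the coefficient with larger magnitude.
    det_f = tuple(np.where(np.abs(da) >= np.abs(db), da, db)
                  for da, db in zip(det_a, det_b))
    return pywt.idwt2((cA_f, det_f), wavelet)

ct, mri = np.random.rand(128, 128), np.random.rand(128, 128)
print(dwt_pca_fuse(ct, mri).shape)  # (128, 128)
```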
9. Neural Network Based Normalized Fusion Approaches for Optimized Multimodal Biometric Authentication Algorithm (Cited by 2)
Authors: E. Sujatha, A. Chilambuchelvan. Circuits and Systems, 2016, No. 8, pp. 1199-1206 (8 pages)
A multimodal biometric system is applied to recognize individuals for authentication using neural networks. In this paper, a multimodal biometric algorithm is designed by integrating iris, finger vein, palm print and face biometric traits. A normalized score-level fusion approach is applied, optimized and encoded for the matching decision. It is a multilevel wavelet, phase-based fusion algorithm. This robust multimodal biometric algorithm increases the security level and accuracy, reduces memory size and the equal error rate, and eliminates the vulnerabilities of unimodal biometric algorithms.
Keywords: multimodal biometrics, score-level fusion approach, neural network, optimization
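Normalized score-level fusion is compact enough to show directly: min-max normalize each matcher's scores onto [0, 1], then combine them with per-modality weights. The weights and raw scores below are made up for illustration:

```python
import numpy as np

def min_max_normalize(scores):
    """Map raw matcher scores onto [0, 1]."""
    s = np.asarray(scores, dtype=float)
    return (s - s.min()) / (s.max() - s.min() + 1e-12)

def fuse_scores(score_lists, weights=None):
    """Weighted-sum fusion of normalized per-modality match scores."""
    norm = np.stack([min_max_normalize(s) for s in score_lists])
    w = (np.ones(len(score_lists)) / len(score_lists) if weights is None
         else np.asarray(weights, dtype=float))
    return w @ norm  # fused score per enrolled candidate

# Raw scores for 4 candidates from 4 modalities, on different scales:
iris, vein = [10, 80, 30, 5], [0.2, 0.9, 0.4, 0.1]
palm, face = [300, 900, 500, 200], [1, 7, 3, 2]
fused = fuse_scores([iris, vein, palm, face], weights=[0.3, 0.3, 0.2, 0.2])
print("accepted candidate:", fused.argmax())  # candidate 1
```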
10. Multi-Modal Medical Image Fusion Based on Improved Parameter Adaptive PCNN and Latent Low-Rank Representation
Authors: Zirui Tang, Xianchun Zhou. Instrumentation, 2024, No. 2, pp. 53-63 (11 pages)
Multimodal medical image fusion can help physicians provide more accurate treatment plans for patients, as unimodal images provide limited valid information. To address the insufficient ability of traditional medical image fusion solutions to protect image details and significant information, a new multimodal medical image fusion method (NSST-PAPCNN-LatLRR) is proposed in this paper. First, the high- and low-frequency sub-band coefficients are obtained by decomposing the source image using NSST. Then, the latent low-rank representation algorithm is used to process the low-frequency sub-band coefficients; an improved PAPCNN algorithm is also proposed for the fusion of high-frequency sub-band coefficients. The improved PAPCNN model is based on automatic parameter setting, with an optimal configuration of the time decay factor αe. The experimental results show that, in comparison with five mainstream fusion algorithms, the new algorithm significantly improves the visual effect, enhances the ability to characterize important information in images, and further improves the protection of detailed information; it achieves at least four first places across six objective indexes.
Keywords: image fusion, improved parameter-adaptive PCNN, non-subsampled shearlet transform, latent low-rank representation
11. 3D Vehicle Detection Algorithm Based on Multimodal Decision-Level Fusion
Authors: Peicheng Shi, Heng Qi, Zhiqiang Liu, Aixi Yang. Computer Modeling in Engineering & Sciences (SCIE, EI), 2023, No. 6, pp. 2007-2023 (17 pages)
3D vehicle detection based on LiDAR-camera fusion is becoming an emerging research topic in autonomous driving. The algorithm based on the Camera-LiDAR object candidate fusion method (CLOCs) is currently considered a more effective decision-level fusion algorithm, but it does not fully utilize the extracted 3D and 2D features. Therefore, we propose a 3D vehicle detection algorithm based on multimodal decision-level fusion. First, the anchor point of the 3D detection bounding box is projected into the 2D image and the distance between the 2D and 3D anchor points is calculated; this distance serves as a new fusion feature that enhances the feature redundancy of the network. Subsequently, an attention module (squeeze-and-excitation networks) weights each feature channel to enhance the important features of the network and suppress useless ones. The experimental results show that the mean average precision of the algorithm on the KITTI dataset is 82.96%, which outperforms previous state-of-the-art multimodal fusion-based methods, and the average accuracy on the Easy, Moderate and Hard evaluation indicators reaches 88.96%, 82.60% and 77.31%, respectively, higher than the original CLOCs model by 1.02%, 2.29% and 0.41%. Compared with the original CLOCs algorithm, our algorithm has higher accuracy and better performance in 3D vehicle detection.
Keywords: 3D vehicle detection, multimodal fusion, CLOCs, network structure optimization, attention module
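The new fusion feature is simply the pixel distance between a projected 3D anchor and its matched 2D anchor. A toy computation, with illustrative KITTI-like camera intrinsics and made-up anchor coordinates:

```python
import numpy as np

def anchor_distance_feature(pt3d, box2d_center, K):
    """Project a 3D detection anchor into the image with intrinsics K,
    then return its pixel distance to the 2D detection's anchor point."""
    uvw = K @ pt3d                 # pinhole projection to homogeneous pixels
    uv = uvw[:2] / uvw[2]          # normalize by depth
    return np.linalg.norm(uv - box2d_center)

K = np.array([[721.5, 0.0, 609.6],    # focal lengths / principal point
              [0.0, 721.5, 172.9],    # (illustrative values only)
              [0.0, 0.0, 1.0]])
pt3d = np.array([2.0, 1.5, 15.0])         # 3D anchor in camera frame (m)
box2d_center = np.array([700.0, 240.0])   # matched 2D detection center (px)
print(anchor_distance_feature(pt3d, box2d_center, K))  # ~7.7 px
```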
12. MFF-Net: Multimodal Feature Fusion Network for 3D Object Detection
Authors: Peicheng Shi, Zhiqiang Liu, Heng Qi, Aixi Yang. Computers, Materials & Continua (SCIE, EI), 2023, No. 6, pp. 5615-5637 (23 pages)
In complex traffic environments, it is very important for autonomous vehicles to accurately perceive, in advance, the dynamic information of other vehicles around them. The accuracy of 3D object detection is affected by problems such as illumination changes, object occlusion and detection distance. We face these challenges by proposing a multimodal feature fusion network for 3D object detection (MFF-Net). This paper first uses a spatial transformation projection algorithm to map image features into the feature space, so that the image features share the same spatial dimension as the point cloud features when fused. Then, feature channel weighting is performed using an adaptive expression augmentation fusion network to enhance important network features, suppress useless features, and increase the directionality of the network toward features. Finally, this paper adjusts the one-dimensional threshold of the non-maximum suppression algorithm to control false and missed detections. This yields a complete 3D object detection network based on multimodal feature fusion. The experimental results show that the proposed network achieves an average accuracy of 82.60% on the Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) dataset, outperforming previous state-of-the-art multimodal fusion networks. On the Easy, Moderate and Hard evaluation indicators, the accuracy reaches 90.96%, 81.46% and 75.39%, respectively. This shows that MFF-Net performs well in 3D object detection.
Keywords: 3D object detection, multimodal fusion, neural network, autonomous driving, attention mechanism
13. Fusion of color and hallucinated depth features for enhanced multimodal deep learning-based damage segmentation
Authors: Tarutal Ghosh Mondal, Mohammad Reza Jahanshahi. Earthquake Engineering and Engineering Vibration (SCIE, EI, CSCD), 2023, No. 1, pp. 55-68 (14 pages)
Recent advances in computer vision and deep learning have shown that the fusion of depth information can significantly enhance the performance of RGB-based damage detection and segmentation models. However, alongside the advantages, depth sensing also presents many practical challenges. For instance, depth sensors impose an additional payload burden on robotic inspection platforms, limiting the operation time and increasing the inspection cost. Additionally, some lidar-based depth sensors have poor outdoor performance due to sunlight contamination during the daytime. In this context, this study investigates the feasibility of abolishing depth sensing at test time without compromising segmentation performance. An autonomous damage segmentation framework is developed based on recent advancements in vision-based multimodal sensing, such as modality hallucination (MH) and monocular depth estimation (MDE), which require depth data only during model training. At the time of deployment, depth data becomes expendable, as it can be simulated from the corresponding RGB frames. This makes it possible to reap the benefits of depth fusion without any depth perception per se. This study explored two different depth encoding techniques and three different fusion strategies in addition to a baseline RGB-based model. The proposed approach is validated on computer-generated RGB-D data of reinforced concrete buildings subjected to seismic damage. It was observed that the surrogate techniques can increase the segmentation IoU by up to 20.1% with a negligible increase in computation cost. Overall, this study is believed to make a positive contribution to enhancing the resilience of critical civil infrastructure.
Keywords: multimodal data fusion, depth sensing, vision-based inspection, UAV-assisted inspection, damage segmentation, post-disaster reconnaissance, modality hallucination, monocular depth estimation
14. A novel image fusion algorithm based on 2D scale-mixing complex wavelet transform and Bayesian MAP estimation for multimodal medical images
Authors: Abdallah Bengueddoudj, Zoubeida Messali, Volodymyr Mosorov. Journal of Innovative Optical Health Sciences (SCIE, EI, CAS), 2017, No. 3, pp. 52-68 (17 pages)
In this paper, we propose a new image fusion algorithm based on the two-dimensional Scale-Mixing Complex Wavelet Transform (2D-SMCWT). The fusion of the detail 2D-SMCWT coefficients is performed via a Bayesian Maximum a Posteriori (MAP) approach, considering a trivariate statistical model for the local neighborhood of 2D-SMCWT coefficients. For the approximation coefficients, a new fusion rule based on Principal Component Analysis (PCA) is applied. We conduct several experiments using three different groups of multimodal medical images to evaluate the performance of the proposed method. The obtained results prove the superiority of the proposed method over state-of-the-art fusion methods in terms of visual quality and several commonly used metrics. Robustness of the proposed method is further tested against different types of noise. The plots of fusion metrics establish the accuracy of the proposed fusion method.
Keywords: medical imaging, multimodal medical image fusion, scale-mixing complex wavelet transform, MAP Bayes estimation, principal component analysis
15. Multimodal Medical Image Fusion Methods Based on Improved Discrete Wavelet Transform
Authors: XU Lei, TIAN Shu-chang, CUI Can, MENG Qing-le, YANG Rui, JIANG Hong-bing, WANG Feng. China Medical Devices, 2016, No. 6, pp. 1-6 (6 pages)
Objective: This paper proposes a novel discrete wavelet transform (DWT) algorithm for multimodal medical image fusion. Methods: The source medical images are initially transformed by DWT, and the low- and high-frequency sub-images are then fused. The "coefficient absolute value" rule, which provides clear detail, is adopted to fuse the high-frequency coefficients, whereas the "region energy ratio" rule, which efficiently preserves most information of the source images, is employed to fuse the low-frequency coefficients. Finally, the fused image is reconstructed by the inverse wavelet transform. Results: Visual and quantitative experimental results indicate that the proposed fusion method is superior to the traditional wavelet transform and existing fusion methods. Conclusion: The proposed method is a feasible approach for multimodal medical image fusion that can obtain more efficient and accurate fusion results, even in noisy environments.
Keywords: medical equipment maintenance models, clinical medical engineering, medical technology management, Chinese Medical Doctor Association, World Health Organization, medical engineering field, medical technology assessment, clinical engineers
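Both fusion rules named in the abstract are simple enough to sketch with a standard single-level DWT (pywt): a local "region energy ratio" weights the low-frequency bands, and the "coefficient absolute value" rule keeps the larger-magnitude high-frequency coefficients. The wavelet choice and window size are assumptions:

```python
import numpy as np
import pywt
from scipy.ndimage import uniform_filter

def fuse_dwt(img_a, img_b, wavelet="haar", win=3):
    """DWT fusion: region-energy-ratio weighting for approximation
    coefficients, max-absolute-value selection for detail coefficients."""
    cA_a, det_a = pywt.dwt2(img_a, wavelet)
    cA_b, det_b = pywt.dwt2(img_b, wavelet)
    # Region energy: local mean of squared coefficients over win x win.
    e_a = uniform_filter(cA_a ** 2, size=win)
    e_b = uniform_filter(cA_b ** 2, size=win)
    w_a = e_a / (e_a + e_b + 1e-12)        # per-pixel energy ratio
    cA_f = w_a * cA_a + (1 - w_a) * cA_b
    # High-frequency bands: keep the coefficient with larger magnitude.
    det_f = tuple(np.where(np.abs(da) >= np.abs(db), da, db)
                  for da, db in zip(det_a, det_b))
    return pywt.idwt2((cA_f, det_f), wavelet)

a, b = np.random.rand(64, 64), np.random.rand(64, 64)
print(fuse_dwt(a, b).shape)  # (64, 64)
```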
16. Fusion of Multimodal Color Medical Images Using Quaternion Principal Component Analysis
Authors: Qamar Nawaz, Xiao Bin, Li Weisheng, Isma Hamid. Proceedings of the International Conference on Computer Frontiers (国际计算机前沿大会会议论文集), 2017, No. 2, pp. 94-96 (3 pages)
Multimodal medical image fusion is used to merge functional and structural information of the same body organ. Most multimodal image fusion algorithms are designed to fuse grayscale images produced by different imaging modalities. Colour distortion and information loss are likely to occur in the fused image when source images are fused using algorithms not originally designed for colour images. These problems can be avoided by representing and processing source images as quaternion numbers. The quaternion representation of a colour pixel encodes the information of its colour channels in the imaginary parts of a quaternion number and offers the advantage of processing colour information holistically as a vector field. In this paper, we propose an image fusion algorithm based on Quaternion Principal Component Analysis (QPCA) to fuse multimodal colour medical images. Quaternion principal components are calculated by decomposing the quaternion covariance matrix using Quaternion Eigenvalue Decomposition (QEVD). The fusion rule is designed based on fusion weights extracted from the most influential principal component. To test the performance of the proposed algorithm, experiments have been performed on six image sets of multimodal colour images of the brain. Experimental results are compared objectively with existing image fusion algorithms. The comparison shows that the proposed algorithm performs better than existing algorithms in fusing colour medical images.
Keywords: QPCA-based image fusion, multimodal image fusion, colour image fusion, medical image fusion
17. Fuzzy least brain storm optimization and entropy-based Euclidean distance for multimodal vein-based recognition system (Cited by 1)
Authors: Dipti Verma, Sipi Dubey. Journal of Central South University (SCIE, EI, CAS, CSCD), 2017, No. 10, pp. 2360-2371 (12 pages)
Nowadays, vein-based recognition has become an emerging and facilitating biometric technology in recognition systems. Vein recognition exploits different modalities, such as finger, palm and hand images, for person identification. In this work, fuzzy least brain storm optimization and an entropy-based Euclidean distance (EED) are proposed for a vein-based recognition system. Initially, the input image is fed into region of interest (ROI) extraction, which obtains the appropriate image for the subsequent steps. Then, the vein-pattern features are extracted by image enlightening, a circular averaging filter and holoentropy-based thresholding. After the features are obtained, the entropy-based Euclidean distance is proposed to fuse the features by score-level fusion with weighted score values. Finally, the optimal matching score is computed iteratively by the newly developed fuzzy least brain storm optimization (FLBSO) algorithm, which combines the least mean square (LMS) algorithm and fuzzy brain storm optimization (FBSO). Experimental results are evaluated and the performance is compared with existing systems using the false acceptance rate (FAR), false rejection rate (FRR) and accuracy. The proposed algorithm attains a higher accuracy of 89.9%, which ensures a better recognition rate.
Keywords: multimodality, brain storm optimization (BSO), least mean square (LMS), score-level fusion, recognition
18. Multimodal spontaneous affect recognition using neural networks learned with hints
Authors: Zhang Xin, Lü Kun. Journal of Beijing Institute of Technology (EI, CAS), 2014, No. 1, pp. 117-125 (9 pages)
A multimodal fusion classifier is presented based on neural networks (NNs) learned with hints for automatic spontaneous affect recognition. Because different channels can provide complementary information, features are utilized from four behavioral cues: frontal-view facial expression, profile-view facial expression, shoulder movement, and vocalization (audio). NNs are used both in single-cue processing and in multimodal fusion. Coarse categories and quadrants in the activation-evaluation dimensional space are utilized respectively as the heuristic information (hints) of the NNs during training, aiming at recognition of basic emotions. With the aid of hints, the weights in the NNs can learn optimal feature groupings, and the subtlety and complexity of spontaneous affective states can be better modeled. The proposed method requires low computation effort and reaches high recognition accuracy even when the training data is insufficient. Experiment results on the SEMAINE naturalistic dataset demonstrate that the method is effective and promising.
Keywords: affect recognition, multimodal fusion, neural network learned with hints, spontaneous affect
19. Multimodal Spatiotemporal Feature Map for Dynamic Gesture Recognition
Authors: Xiaorui Zhang, Xianglong Zeng, Wei Sun, Yongjun Ren, Tong Xu. Computer Systems Science & Engineering (SCIE, EI), 2023, No. 7, pp. 671-686 (16 pages)
Gesture recognition technology enables machines to read human gestures and has significant application prospects in human-computer interaction and sign language translation. Existing research usually uses convolutional neural networks to extract features directly from raw gesture data, but the networks are affected by interference in the input data and thus fit unimportant features. In this paper, we propose a novel method for encoding spatio-temporal information that can enhance the key features required for gesture recognition, such as the shape, structure, contour, position and hand motion of gestures, thereby improving recognition accuracy. This encoding method can encode arbitrarily many frames of gesture data into a single-frame spatio-temporal feature map and use this map as the input to the neural network. This guides the model to fit important features while avoiding complex recurrent network structures for extracting temporal features. In addition, we design two sub-networks and train the model with a sub-network pre-training strategy that trains the sub-networks first and then the entire network, so as to avoid the sub-networks focusing too much on single-category features and being overly influenced by each other's features. Experimental results on two public gesture datasets show that the proposed spatio-temporal information encoding method achieves advanced accuracy.
Keywords: dynamic gesture recognition, spatio-temporal information encoding, multimodal input, pre-training, score fusion
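The abstract leaves the encoding itself unspecified; one classical way to collapse many frames into a single spatio-temporal map is recency-weighted accumulation of frame differences (motion-history style). The sketch below shows that generic idea, not the paper's exact scheme:

```python
import numpy as np

def spatiotemporal_map(frames):
    """Collapse T gesture frames into one 2-D map: absolute frame
    differences capture motion, and later frames receive larger weights
    so the map encodes both where and roughly when motion happened."""
    T = len(frames)
    weights = np.arange(1, T + 1) / T                    # recency weights
    diffs = [np.abs(frames[t] - frames[t - 1]) for t in range(1, T)]
    stmap = sum(w * d for w, d in zip(weights[1:], diffs))
    return stmap / stmap.max()                           # normalize to [0, 1]

frames = [np.random.rand(64, 64) for _ in range(16)]     # stand-in frames
m = spatiotemporal_map(frames)
print(m.shape, float(m.max()))  # (64, 64) 1.0
```

The resulting single-frame map can then be fed to an ordinary 2-D CNN, which matches the abstract's point about avoiding recurrent structures for temporal modeling.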
20. A Multimodal Feature Fusion Public Opinion Representation Algorithm Based on SEFusion-MPOR
Authors: Guo Xiaoyu, Ma Jing. Information Studies: Theory & Application (情报理论与实践) (CSSCI, PKU Core), 2024, No. 7, pp. 181-189 (9 pages)
[Purpose/Significance] Multimodal public opinion representation is the foundation of multimodal public opinion computing and analysis. This paper explores a public opinion representation algorithm that assigns dynamic weights to the features of different modalities, which can capture inter-modal dependencies more precisely, greatly reduce the complexity of multimodal public opinion representation, and reduce computing resource consumption. [Method/Process] Building on pre-trained model features, the SEFusion-MPOR algorithm constructs squeeze-and-excitation operators from fully connected layers, gating mechanisms and activation functions to obtain dynamic weights for each modality, applies these weights to the corresponding modality via matrix multiplication, and thereby builds a multimodal feature fusion representation algorithm for online public opinion. [Result/Conclusion] Experiments on two public multimodal opinion datasets, Memotion 3 and MVSA-multiple, show that the proposed representation method achieves the best results on multiple subtasks compared with baseline models. The method matches the effect of complex representation algorithms through simple operations, and it is interpretable and extrapolable. Its efficient and accurate representation is suitable not only for public opinion intelligence processing but also for general multimodal information representation in intelligence analysis. [Limitations] The validation is limited to bimodal datasets and does not cover datasets with more modalities.
Keywords: multimodal public opinion, multimodal feature fusion, public opinion representation, pre-trained models, SEFusion-MPOR
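The described squeeze-and-excitation operator (fully connected layers plus a gating activation that emit one dynamic weight per modality, applied by multiplication) maps naturally onto a few lines of PyTorch. The dimensions, the sigmoid gate and the two-modality setup below are assumptions based on the abstract, not the paper's published configuration:

```python
import torch
import torch.nn as nn

class SEModalityFusion(nn.Module):
    """Squeeze-and-excitation over modalities: an FC + gating block
    produces one dynamic weight per modality, which then rescales that
    modality's pre-trained features before fusion."""
    def __init__(self, n_modalities=2, dim=768, reduction=4):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(n_modalities * dim, n_modalities * dim // reduction),
            nn.ReLU(),
            nn.Linear(n_modalities * dim // reduction, n_modalities),
            nn.Sigmoid(),          # dynamic weight in (0, 1) per modality
        )

    def forward(self, feats):                    # feats: (B, M, D)
        B, M, D = feats.shape
        w = self.gate(feats.reshape(B, M * D))   # (B, M) dynamic weights
        # Apply weights to the corresponding modalities, then flatten
        # into one fused representation vector.
        return (w.unsqueeze(-1) * feats).reshape(B, M * D)

fusion = SEModalityFusion()
text_img = torch.randn(4, 2, 768)   # e.g., text + image pre-trained features
print(fusion(text_img).shape)       # torch.Size([4, 1536])
```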