At present, most experimental teaching systems lack operator guidance, so users often do not know what to do during an experiment. This increases the user load and reduces students' learning efficiency. To address the insufficient interactivity and guidance of such systems, this paper proposes an experimental navigation system based on multi-mode fusion. The system first obtains user information through hardware sensing devices, then intelligently perceives the user's intention and the progress of the experiment from the acquired information, and finally carries out multi-modal intelligent navigation for the user. The innovative aspect of this study is the use of an intelligent multi-mode navigation system to guide users in conducting experiments, thereby reducing the user load and enabling users to complete their experiments effectively. The results show that the system can guide users through their experiments, effectively reduce the user load during interaction, and improve efficiency.
Multi-modal fusion technology has gradually become fundamental to many fields, such as autonomous driving, smart healthcare, sentiment analysis, and human-computer interaction. It is rapidly becoming a dominant research direction owing to its powerful perception and judgment capabilities. In complex scenes, multi-modal fusion technology uses the complementary characteristics of multiple data streams to fuse different data types and achieve more accurate predictions. However, achieving outstanding performance is challenging because of equipment performance limitations, missing information, and data noise. This paper comprehensively reviews existing methods based on multi-modal fusion techniques and provides a detailed, in-depth analysis. According to the stage at which data are fused, multi-modal fusion methods fall into four primary categories: early fusion, deep fusion, late fusion, and hybrid fusion. The paper surveys the three major multi-modal fusion technologies that can significantly enhance the effect of data fusion and further explores the applications of multi-modal fusion technology in various fields. Finally, it discusses the challenges and explores potential research opportunities. Multi-modal tasks still need intensive study because of data heterogeneity and quality. Preserving complementary information and eliminating redundant information between modalities is critical in multi-modal technology. Ineffective data fusion methods may introduce extra noise and lead to worse results. This paper provides a comprehensive and detailed summary in response to these challenges.
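As a rough illustration of two of the fusion stages named in this abstract, the sketch below contrasts early fusion (features are concatenated before a single scorer) with late fusion (each modality is scored separately and only the scores are combined). It is a toy written for this listing, not code from the surveyed paper; the feature values, the linear scorers, and the names `early_fusion`, `late_fusion`, and `linear_score` are all assumptions.

```python
def linear_score(features, weights):
    """Weighted sum of a feature vector (a stand-in for any per-modality model)."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return sum(w * x for w, x in zip(weights, features))


def early_fusion(feat_a, feat_b, weights):
    """Early fusion: concatenate the raw features, then apply one shared scorer."""
    fused = feat_a + feat_b  # list concatenation joins the two modalities
    return linear_score(fused, weights)


def late_fusion(score_a, score_b, alpha=0.5):
    """Late fusion: combine per-modality scores with a mixing weight alpha."""
    return alpha * score_a + (1.0 - alpha) * score_b


# Hypothetical toy features for an image branch and a text branch.
image_feat = [0.2, 0.8]
text_feat = [0.5, 0.1, 0.4]

early = early_fusion(image_feat, text_feat, weights=[1.0] * 5)
late = late_fusion(
    linear_score(image_feat, [1.0, 1.0]),
    linear_score(text_feat, [1.0, 1.0, 1.0]),
    alpha=0.5,
)
print(early, late)
```

Deep and hybrid fusion, the other two categories, combine intermediate representations or mix several stages, and are not covered by this toy.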
In recent years, efficiently and accurately identifying multi-modal fake news has become more challenging. First, multi-modal data provide more evidence, but not all of it is equally important. Second, social structure information has proven effective in fake news detection, and combining it while reducing noise is critical. Unfortunately, existing approaches fail to handle these problems. This paper proposes a multi-modal fake news detection framework based on Text-modal Dominance and fusing Multiple Multi-modal Cues (TD-MMC), which uses three valuable multi-modal cues: text-modal importance, text-image complementarity, and text-image inconsistency. TD-MMC is dominated by textual content and assisted by image information, while using social network information to enhance the text representation. To reduce interference from irrelevant social structure information, a unidirectional cross-modal attention mechanism is used to selectively learn the social structure's features. A cross-modal attention mechanism is adopted to obtain text-image cross-modal features while retaining textual features, reducing the loss of important information. In addition, TD-MMC employs a new multi-modal loss to improve the model's generalization ability. Extensive experiments were conducted on two public real-world datasets, one English and one Chinese, and the results show that the proposed model outperforms state-of-the-art methods on classification evaluation metrics.
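The unidirectional cross-modal attention mentioned in this abstract can be pictured as text features acting as queries over social-structure features, with the text features kept through a residual sum. The sketch below is a generic scaled dot-product attention written under that assumption; the function name `text_guided_attention`, the residual connection, and the toy vectors are illustrative and are not taken from TD-MMC itself.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def text_guided_attention(text, social):
    """One-directional attention: text vectors query social vectors.

    Social vectors serve as both keys and values. The text vector is added
    back (residual) so textual information is retained. The social features
    never attend to the text, which is what makes the flow unidirectional.
    """
    dim = len(social[0])
    outputs = []
    for q in text:
        weights = softmax([dot(q, k) / math.sqrt(dim) for k in social])
        attended = [
            sum(w * v[j] for w, v in zip(weights, social)) for j in range(dim)
        ]
        outputs.append([qi + ai for qi, ai in zip(q, attended)])
    return outputs


# Hypothetical 2-dimensional features: two text tokens, three social nodes.
text_vectors = [[1.0, 0.0], [0.0, 1.0]]
social_vectors = [[0.5, 0.5], [1.0, 0.0], [0.0, 1.0]]
print(text_guided_attention(text_vectors, social_vectors))
```

A trained model would also apply learned projection matrices to queries, keys, and values; they are omitted here to keep the data flow visible.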
Fusing hand-based features in multi-modal biometric recognition enhances anti-spoofing capability and leverages inter-modal correlation to improve recognition performance. The robustness and recognition performance of the system can likewise be improved by judiciously exploiting the correlation among multimodal features. Nevertheless, two issues persist in multi-modal feature fusion recognition. First, efforts to improve fusion recognition performance have not comprehensively considered the correlations among distinct modalities. Second, during modal fusion, improper weight selection diminishes the salience of crucial modal features and thereby degrades overall recognition performance. To address these two issues, we introduce an enhanced DenseNet multimodal recognition network founded on feature-level fusion. The information from the three modalities is fused in a manner akin to RGB channels, and once it is input to the network the correlation between modalities is strengthened through channel correlation. Within the enhanced DenseNet network, the Efficient Channel Attention Network (ECA-Net) dynamically adjusts the weight of each channel to amplify the salience of crucial information in each modal feature. Depthwise separable convolution markedly reduces the number of training parameters and further enhances feature correlation. Experimental evaluations were conducted on four multimodal databases built from six unimodal databases, including the multispectral palmprint and palm vein databases from the Chinese Academy of Sciences. The Equal Error Rate (EER) values were 0.0149%, 0.0150%, 0.0099%, and 0.0050%, respectively. Compared with other network methods for palmprint, palm vein, and finger vein fusion recognition, this approach substantially improves recognition performance, making it suitable for high-security environments and practical applications. The experiments in this article used a modest sample database of 200 individuals. The next phase is to prepare for extending the method to larger databases.
The all-wheel drive (AWD) hybrid system is a research focus for high-performance new energy vehicles because it can meet the demands of dynamic performance and passing ability. Simultaneously optimizing the power and economy of hybrid vehicles remains an issue. A unique multi-mode coupling (MMC) AWD hybrid system is presented that realizes both distributed and centralized driving of the front and rear axles, achieving vectored distribution and full utilization of system power between the axles. Based on the parameters of a benchmark hybrid vehicle model, an energy management strategy based on model predictive control is proposed. First, the drive system model was built after analyzing the MMC-AWD's drive modes. Next, three fundamental strategies were established to address power distribution adjustment and battery state-of-charge (SOC) maintenance when the SOC changed, followed by the design of a road driving force observer. Then, the energy consumption rate in the average time domain was processed before designing the minimum fuel consumption controller based on the equivalent fuel consumption coefficient. Finally, the advantage of the MMC-AWD was confirmed by comparing its dynamic performance and economy with those of the BYD Song PLUS DMI-AWD. The findings indicate that, relative to the comparison hybrid system, the MMC-AWD's acceleration capacity increases by 5.26% and 7.92% at road adhesion coefficients of 0.8 and 0.6, respectively. At road adhesion coefficients of 0.8, 0.6, and 0.4, the maximum climbing ability increases by 14.22%, 12.88%, and 4.55%, respectively. The dynamic performance is therefore greatly enhanced, and the fuel savings rate per 100 km reaches 12.06%, which is also highly economical. The proposed control strategies for the new hybrid AWD vehicle can optimize power and economy simultaneously.
Mill vibration is a common problem in rolling production. It directly affects the thickness accuracy of the strip and, in serious cases, may even lead to strip fracture accidents. Existing vibration prediction models do not consider the features contained in the data, which limits improvements in model accuracy. To address these challenges, this paper proposes a multi-dimensional multi-modal cold rolling vibration time series prediction model (MDMMVPM) based on the deep fusion of multi-level networks. The model considers the long-term and short-term modal features of multi-dimensional data and selects appropriate prediction algorithms for different data features. Based on the established prediction model, the effects of tension and rolling force on mill vibration are analyzed. Taking the 5th stand of a cold mill in a steel mill as the research object, the model is applied to predict mill vibration for the first time. The experimental results show that the coefficient of determination (R^2) of the proposed model is 92.5% and the root-mean-square error (RMSE) is 0.0011, a significant improvement in modeling accuracy over existing models. The proposed model is also suitable for the hot rolling process, providing a new method for predicting strip rolling vibration.
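For readers unfamiliar with the two metrics quoted in this abstract, the sketch below computes R^2 and RMSE using their standard definitions. The sample values are invented for illustration and have no connection to the mill data; the function names `rmse` and `r_squared` are likewise just labels chosen here.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error: square root of the mean squared residual."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up measured and predicted values, only to exercise the formulas.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

print(r_squared(measured, predicted), rmse(measured, predicted))
```

RMSE carries the units of the predicted quantity, so the reported 0.0011 is only meaningful relative to the scale of the vibration signal, which the abstract does not state.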
To address the difficulties in fusing multi-mode sensor data for complex industrial machinery, an adaptive deep coupling convolutional auto-encoder (ADCCAE) fusion method was proposed. First, the multi-mode features extracted synchronously by the CCAE were stacked and fed to the multi-channel convolution layers for fusion. Then, the fused data was passed to fully connected layers for compression and fed to the Softmax module for classification. Finally, the coupling loss function coefficients and the network parameters were optimized through an adaptive approach using the gray wolf optimization (GWO) algorithm. Experimental comparisons showed that the proposed ADCCAE fusion model was superior to existing models for multi-mode data fusion.
Multimodal Sentiment Analysis (SA) is gaining popularity due to its broad application potential. Existing studies have focused on the SA of single modalities, such as text or photos, which makes it difficult to handle social media data with multiple modalities effectively. Moreover, most multimodal research has merely combined the two modalities rather than exploring their complex correlations, leading to unsatisfactory sentiment classification results. Motivated by this, we propose a new visual-textual sentiment classification model named Multi-Model Fusion (MMF), which uses a mixed fusion framework for SA to effectively capture the essential information and the intrinsic relationship between visual and textual content. The proposed model comprises three deep neural networks. Two different neural networks are proposed to extract the most emotionally relevant aspects of image and text data, so that more discriminative features are gathered for accurate sentiment classification. Then, a multichannel joint fusion model with a self-attention technique is proposed to exploit the intrinsic correlation between visual and textual characteristics and obtain emotionally rich information for joint sentiment classification. Finally, the results of the three classifiers are integrated using a decision fusion scheme to improve the robustness and generalizability of the proposed model. An interpretable visual-textual sentiment classification model is further developed using the Local Interpretable Model-agnostic Explanation model (LIME) to ensure the model's explainability and resilience. The proposed MMF model has been tested on four real-world sentiment datasets, achieving 99.78% accuracy on Binary_Getty (BG), 99.12% on Binary_iStock (BIS), 95.70% on Twitter, and 79.06% on the Multi-View Sentiment Analysis (MVSA) dataset. These results demonstrate the superior performance of the MMF model compared with single-model approaches and current state-of-the-art techniques on the model evaluation criteria.
Multimodal medical image fusion can help physicians provide more accurate treatment plans for patients, as unimodal images provide limited valid information. To address the insufficient ability of traditional medical image fusion solutions to protect image details and significant information, a new multimodality medical image fusion method (NSST-PAPCNNLatLRR) is proposed in this paper. First, the high- and low-frequency sub-band coefficients are obtained by decomposing the source image using NSST. Then, the latent low-rank representation algorithm is used to process the low-frequency sub-band coefficients, and an improved PAPCNN algorithm is proposed for fusing the high-frequency sub-band coefficients. The improved PAPCNN model is based on automatic parameter setting, and an optimal configuration method is used for the time decay factor αe. The experimental results show that, compared with five mainstream fusion algorithms, the new algorithm significantly improves the visual effect, enhances the ability to characterize important information in images, and further improves the protection of detailed information. The new algorithm ranks first on at least four of the six objective indexes.
PowerShell has been widely deployed in fileless malware and advanced persistent threat (APT) attacks due to its high stealthiness and living-off-the-land technique. However, existing works mainly focus on deobfuscation and malicious script detection, and lack classification of malicious PowerShell families and behavior analysis. Moreover, state-of-the-art methods fail to capture fine-grained features and semantic relationships, resulting in low robustness and accuracy. To this end, we propose PowerDetector, a novel malicious PowerShell script detector based on multimodal semantic fusion and deep learning. Specifically, we design four feature extraction methods to extract key features from characters, tokens, the abstract syntax tree (AST), and a semantic knowledge graph. Then, we design four embeddings (i.e., Char2Vec, Token2Vec, AST2Vec, and Rela2Vec) and construct a multi-modal fusion algorithm to concatenate feature vectors from different views. Finally, we propose a combined model based on a transformer and CNN-BiLSTM to implement PowerShell family detection. Our experiments with five types of PowerShell attacks show that PowerDetector can accurately detect various obfuscated and stealthy PowerShell scripts, with a precision of 0.9402, a recall of 0.9358, and an F1-score of 0.9374. Furthermore, through single-modal and multi-modal comparison experiments, we demonstrate that PowerDetector's multi-modal embedding and deep learning model can achieve better accuracy and even identify more unknown attacks.
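The three scores reported in this abstract can be related through the usual definition of F1 as the harmonic mean of precision and recall. The short check below applies that definition to the reported precision and recall; it is a reader's sanity check, not part of PowerDetector. The harmonic mean comes out at roughly 0.9380, slightly above the reported 0.9374. One possible explanation, stated here only as an assumption, is that the paper averages per-family scores across the five attack types, in which case the averaged F1 need not equal the harmonic mean of the averaged precision and recall.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall (the standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Values quoted in the abstract above.
reported_precision = 0.9402
reported_recall = 0.9358
reported_f1 = 0.9374

harmonic_f1 = f1_from_precision_recall(reported_precision, reported_recall)
gap = harmonic_f1 - reported_f1
print(round(harmonic_f1, 4), round(gap, 4))
```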
In geometry processing, symmetry research benefits from the global geometric features of complete shapes, but the shape of an object captured in real-world applications is often incomplete because of limited sensor resolution, a single viewpoint, and occlusion. Unlike existing works that predict symmetry from the complete shape, we propose a learning approach for symmetry prediction based on a single RGB-D image. Instead of directly predicting the symmetry from incomplete shapes, our method consists of two modules: the multi-modal feature fusion module and the detection-by-reconstruction module. First, we build a channel-transformer network (CTN) to extract cross-fusion features from the RGB-D image as the multi-modal feature fusion module, which helps aggregate features from the color and depth channels separately. Then, our self-reconstruction network based on a 3D variational auto-encoder (3D-VAE) takes the global geometric features as input, followed by a symmetry prediction network that detects the symmetry. Experiments conducted on three public datasets (ShapeNet, YCB, and ScanNet) demonstrate that our method can produce reliable and accurate results.
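To address missed and false detections of objects on the road in autonomous driving, a 3D object detection model for autonomous driving based on an improved Centerfusion is proposed. The model fuses camera information with radar features to form a multi-channel feature input, which strengthens the robustness of the detection network and reduces missed detections. To obtain more accurate and richer 3D detection information, an improved attention mechanism is introduced to enhance the fusion of radar point clouds and visual information in the frustum grid. An improved loss function is used to optimize the accuracy of bounding-box prediction. The model was validated and compared on the Nuscenes dataset. Experimental results show that, compared with the traditional Centerfusion model, the proposed model improves mean Average Precision (mAP) by 1.3% and the Nuscenes Detection Score (NDS) by 1.2%.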
Event extraction is a significant task within information extraction that aims to automatically extract structured event information from vast volumes of unstructured text. Extracting event elements from multi-modal data remains challenging because of the large number of images and overlapping event elements in the data. Although researchers have proposed various methods for this task, most existing event extraction models cannot address these challenges because they apply only to text scenarios. To solve these issues, this paper proposes a multi-modal event extraction method based on knowledge fusion. Specifically, for event-type recognition, we use a pipeline approach that integrates multiple pre-trained models. This approach captures more comprehensively the multidimensional event semantic features present in military texts, thereby strengthening the connection between trigger words and events. For event element extraction, we propose a method for constructing a priori templates that combine event types with their corresponding trigger words. This approach yields fine-grained input samples containing event trigger words, enabling the model to understand the semantic relationships between elements in greater depth. Furthermore, a fusion method for the spatial mapping of textual event elements and image elements is proposed to reduce category number overload and effectively achieve multi-modal knowledge fusion. The experimental results on the CCKS 2022 dataset show that our method achieves competitive results, with a comprehensive F1-score of 53.4% for the model. These results validate the effectiveness of our method in extracting event elements from multi-modal data.
Magnesium (Mg) alloys are considered to be a new generation of revolutionary medical metals. Laser-beam powder bed fusion (PBF-LB) is suitable for fabricating metal implants with personalized and complicated structures. However, the as-built part usually exhibits undesirable microstructure and unsatisfactory performance. In this work, WE43 parts were first fabricated by PBF-LB and then subjected to heat treatment. Although a high densification rate of 99.91% was achieved using suitable processes, the as-built parts exhibited an anisotropic, layered microstructure with heterogeneously precipitated Nd-rich intermetallics. After heat treatment, fine, nano-scaled Mg24Y5 particles precipitated. Meanwhile, the α-Mg grains underwent recrystallization and coarsened slightly, which effectively weakened the texture intensity and reduced the anisotropy. As a consequence, the yield strength and ultimate tensile strength were significantly improved to (250.2 ± 3.5) MPa and (312 ± 3.7) MPa, respectively, while the elongation was still maintained at a high level of 15.2%. Furthermore, the homogenized microstructure reduced the tendency toward localized corrosion and favored the development of a uniform passivation film. Thus, the degradation rate of the WE43 parts decreased by an order of magnitude. In addition, in vitro cell experiments demonstrated their favorable biocompatibility.
In mobile machinery, hydro-mechanical pumps are increasingly being replaced by electronically controlled pumps to raise the automation level, but diversified control functions (e.g., power limitation and pressure cut-off) are integrated into the electronic controller only at the pump level, which can make the overall system unstable. To solve this problem, a multi-mode electrohydraulic load sensing (MELS) control scheme is proposed that specifically considers switching stability at the system level. It includes four working modes: flow control, load sensing, power limitation, and pressure control. Depending on the actual working requirements, the switching rules for the different modes and the switching direction (i.e., whether the modes can be switched bilaterally or unilaterally) are defined. The priority of the modes is also defined, from high to low: pressure control, power limitation, load sensing, and flow control. When multiple switching rules are satisfied at the same time, the system switches to the control mode with the highest priority. In addition, the switching stability between the flow control and pressure control modes is analyzed, and the controller parameters that guarantee switching stability are obtained. A comparative study is carried out on a test rig with a 2-ton hydraulic excavator. The results show that the MELS controller achieves the control functions of proper flow supply, power limitation, and pressure cut-off, with good stability when switching between different control modes. This research proposes a MELS control method that achieves stable multi-mode switching for the hydraulic systems of mobile machinery under different working conditions.
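The priority rule described in this abstract (pressure control over power limitation over load sensing over flow control, with the highest-priority triggered mode winning) can be sketched as a small arbitration function. This is only an illustration of that one rule: the `Mode` enum, the `select_mode` function, and the way triggered modes are passed in are assumptions, and the switching-direction constraints and stability analysis from the paper are not modeled.

```python
from enum import IntEnum


class Mode(IntEnum):
    """Working modes, numbered so that a larger value means higher priority."""
    FLOW_CONTROL = 0
    LOAD_SENSING = 1
    POWER_LIMITATION = 2
    PRESSURE_CONTROL = 3


def select_mode(current, triggered):
    """Pick the next control mode.

    `triggered` is the set of modes whose switching rules are satisfied in
    this control cycle. If several are satisfied at once, the one with the
    highest priority wins; if none is satisfied, the current mode is kept.
    """
    if not triggered:
        return current
    return max(triggered)


# Hypothetical cycle: both load sensing and pressure control rules fire.
next_mode = select_mode(
    Mode.FLOW_CONTROL, {Mode.LOAD_SENSING, Mode.PRESSURE_CONTROL}
)
print(next_mode.name)
```

Encoding priority in the enum value keeps the arbitration to a single `max` call; a real controller would also need the per-mode switching conditions and the rules on which transitions are allowed in each direction.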
Laser powder bed fusion (L-PBF) of Mg alloys has provided tremendous opportunities for the customized production of aeronautical and medical parts. Layer thickness (LT) is of great significance to the L-PBF process but has not been studied for Mg alloys. In this study, WE43 Mg alloy bulk cubes, porous scaffolds, and thin walls were fabricated with layer thicknesses of 10, 20, 30, and 40 μm. The required laser energy input increased with increasing layer thickness and differed between the bulk cubes and the porous scaffolds. Porosity tended to occur at the connection joints in porous scaffolds for LT40 and could be eliminated by reducing the laser energy input. For thin-wall parts, a large overhang angle or a small wall thickness resulted in porosity when a large layer thickness was used, and the porosity disappeared when the layer thickness or laser energy input was reduced. Deeper keyhole penetration was found in all cases with porosity, which explains the influence of layer thickness, geometrical structure, and laser energy input on porosity. All samples achieved a high fusion quality, with a relative density above 99.5%, using the optimized laser energy input. Increased layer thickness resulted in more precipitation phases, finer grain sizes, and decreased grain texture. At similarly high fusion quality, the tensile strength and elongation of bulk samples improved significantly, from 257 MPa and 1.41% with the 10 μm layer to 287 MPa and 15.12% with the 40 μm layer, in accordance with the microstructural change. The effect of layer thickness on the compressive properties of porous scaffolds was limited. However, the corrosion rate of bulk samples increased with increasing layer thickness, mainly because of the larger number of precipitation phases.
Hepatocellular carcinoma (HCC) is one of the most common tumor types and remains a major clinical challenge. Increasing evidence has revealed that mitophagy inhibitors can enhance the effect of chemotherapy on HCC. However, few mitophagy inhibitors have been approved for clinical use in humans. Pyrimethamine (Pyr) is used to treat infections caused by protozoan parasites. Recent studies have reported that Pyr may be beneficial in the treatment of various tumors. However, its mechanism of action is still not clearly defined. Here, we found that blocking mitophagy sensitized cells to Pyr-induced apoptosis. Mechanistically, Pyr potently induced the accumulation of autophagosomes by inhibiting autophagosome-lysosome fusion in human HCC cells. In vitro and in vivo studies revealed that Pyr blocked autophagosome-lysosome fusion by upregulating BNIP3 to inhibit synaptosomal-associated protein 29 (SNAP29)-vesicle-associated membrane protein 8 (VAMP8) interaction. Moreover, Pyr acted synergistically with sorafenib (Sora) to induce apoptosis and inhibit HCC proliferation in vitro and in vivo. Pyr enhances the sensitivity of HCC cells to Sora, a common chemotherapeutic, by inhibiting mitophagy. Thus, these results provide new insights into the mechanism of action of Pyr and imply that Pyr could potentially be further developed as a novel mitophagy inhibitor. Notably, Pyr and Sora combination therapy could be a promising treatment for malignant HCC.
Metal additive manufacturing (AM) has been extensively studied in recent decades. Despite significant progress in manufacturing complex shapes and structures, challenges such as severe cracking when using existing alloys for laser powder bed fusion (L-PBF) AM have persisted. These challenges arise because commercial alloys are primarily designed for conventional casting or forging processes, overlooking the fast cooling rates, steep temperature gradients, and multiple thermal cycles of L-PBF. To address this, there is an urgent need to develop novel alloys specifically tailored for L-PBF technologies. This review provides a comprehensive summary of the strategies employed in alloy design for L-PBF. It aims to guide future research on designing novel alloys dedicated to L-PBF instead of adapting existing alloys. The review begins by discussing the features of L-PBF processes, focusing on rapid solidification and intrinsic heat treatment. Next, the printability of the four main existing alloy families (Fe-, Ni-, Al-, and Ti-based alloys) is critically assessed and compared with their conventional weldability. It was found that weldability criteria are not always applicable in estimating printability. Furthermore, the review presents recent advances in alloy development and associated strategies, categorizing them into crack mitigation-oriented, microstructure manipulation-oriented, and machine learning-assisted approaches. Lastly, an outlook and suggestions are given to highlight the issues that need to be addressed in future work.
A novel image fusion network framework with an autonomous encoder and decoder is proposed to improve the quality of infrared and visible light image fusion and thereby the visual impression of the fused images. The network comprises an encoder module, a fusion layer, a decoder module, and an edge enhancement module. The encoder module uses an enhanced Inception module for shallow feature extraction, then combines Res2Net and Transformer to achieve deep-level co-extraction of local and global features from the original image. An edge enhancement module (EEM) is created to extract significant edge features. A modal maximum difference fusion strategy is introduced to enhance the adaptive representation of information in various regions of the source image, thereby enhancing the contrast of the fused image. The features extracted by the encoder and the EEM are combined in the fusion layer, and the decoder then produces the fused image. Three datasets were chosen to test the algorithm proposed in this paper. The experimental results demonstrate that the network effectively preserves background and detail information in both infrared and visible images, yielding superior outcomes in subjective and objective evaluations.
Funding (multi-mode fusion experimental navigation system paper): the National Key R&D Program of China (No. 2018YFB1004901) and the Independent Innovation Team Project of Jinan City (No. 2019GXRC013).
Funding: supported by the Natural Science Foundation of Liaoning Province (Grant No. 2023-MSBA-070); the National Natural Science Foundation of China (Grant No. 62302086).
Abstract: Multi-modal fusion technology has gradually become a fundamental task in many fields, such as autonomous driving, smart healthcare, sentiment analysis, and human-computer interaction. It is rapidly becoming a dominant research direction because of its powerful perception and judgment capabilities. In complex scenes, multi-modal fusion technology uses the complementary characteristics of multiple data streams to fuse different data types and achieve more accurate predictions. However, achieving outstanding performance is challenging because of equipment performance limitations, missing information, and data noise. This paper comprehensively reviews existing methods based on multi-modal fusion techniques and provides a detailed, in-depth analysis. According to the data fusion stage, multi-modal fusion has four primary methods: early fusion, deep fusion, late fusion, and hybrid fusion. The paper surveys the three major multi-modal fusion technologies that can significantly enhance the effect of data fusion and further explores the applications of multi-modal fusion technology in various fields. Finally, it discusses the challenges and explores potential research opportunities. Multi-modal tasks still need intensive study because of data heterogeneity and quality. Preserving complementary information and eliminating redundant information between modalities is critical in multi-modal technology. Invalid data fusion methods may introduce extra noise and lead to worse results. This paper provides a comprehensive and detailed summary in response to these challenges.
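As a hedged illustration of two of the fusion stages named in this abstract, the following minimal Python sketch contrasts early fusion (concatenate per-modality features, then apply one model) with late fusion (apply one model per modality, then combine the scores). The names `early_fusion`, `late_fusion`, and the toy `mean_score` model are hypothetical and chosen only for illustration; they are not taken from the reviewed paper, and deep and hybrid fusion are not shown.

```python
from typing import Callable, List, Optional, Sequence

Vector = Sequence[float]
Model = Callable[[Vector], float]


def early_fusion(features: Sequence[Vector], model: Model) -> float:
    """Concatenate all modality features, then run a single model on the joint vector."""
    joint: List[float] = [x for feat in features for x in feat]
    return model(joint)


def late_fusion(
    features: Sequence[Vector],
    models: Sequence[Model],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Run one model per modality, then combine the scores by a weighted average."""
    scores = [m(f) for m, f in zip(models, features)]
    if weights is None:
        # Assumption: equal weights when none are supplied.
        weights = [1.0 / len(scores)] * len(scores)
    return sum(w * s for w, s in zip(weights, scores))


def mean_score(v: Vector) -> float:
    """Toy stand-in for a trained model: the mean of the feature vector."""
    return sum(v) / len(v)


image_feat = [0.2, 0.4]       # hypothetical image features
text_feat = [0.9, 0.7, 0.8]   # hypothetical text features

early = early_fusion([image_feat, text_feat], mean_score)
late = late_fusion([image_feat, text_feat], [mean_score, mean_score])
```

The two results differ on the same inputs because early fusion weights every feature equally, whereas late fusion weights every modality equally.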
Funding: This research was funded by the General Project of Philosophy and Social Science of Heilongjiang Province, Grant Number: 20SHB080.
Abstract: In recent years, efficiently and accurately identifying multi-modal fake news has become more challenging. First, multi-modal data provides more evidence, but not all of it is equally important. Second, social structure information has proven effective in fake news detection, and combining it while reducing noise is critical. Unfortunately, existing approaches fail to handle these problems. This paper proposes a multi-modal fake news detection framework based on Text-modal Dominance and fusing Multiple Multi-model Cues (TD-MMC), which uses three valuable multi-modal clues: text-modal importance, text-image complementarity, and text-image inconsistency. TD-MMC is dominated by textual content and assisted by image information, while using social network information to enhance the text representation. To reduce interference from irrelevant social structure information, we use a unidirectional cross-modal attention mechanism to selectively learn the social structure's features. A cross-modal attention mechanism is adopted to obtain text-image cross-modal features while retaining textual features, which reduces the loss of important information. In addition, TD-MMC employs a new multi-modal loss to improve the model's generalization ability. Extensive experiments have been conducted on two public real-world English and Chinese datasets, and the results show that the proposed model outperforms state-of-the-art methods on classification evaluation metrics.
Funding: funded by the National Natural Science Foundation of China (61991413); the China Postdoctoral Science Foundation (2019M651142); the Natural Science Foundation of Liaoning Province (2021-KF-12-07); the Natural Science Foundation of Liaoning Province (2023-MS-322).
Abstract: Fusing hand-based features in multi-modal biometric recognition enhances anti-spoofing capabilities. It also leverages inter-modal correlation to improve recognition performance, and judicious use of the correlation among multimodal features can improve the robustness of the system as well. Nevertheless, two issues persist in multi-modal feature fusion recognition. First, efforts to improve fusion recognition performance have not comprehensively considered the correlations among distinct modalities. Second, during modal fusion, improper weight selection diminishes the salience of crucial modal features and thereby lowers overall recognition performance. To address these two issues, we introduce an enhanced DenseNet multimodal recognition network founded on feature-level fusion. The information from the three modalities is fused in a manner akin to RGB channels, and the input network augments the correlation between modes through channel correlation. Within the enhanced DenseNet network, the Efficient Channel Attention Network (ECA-Net) dynamically adjusts the weight of each channel to amplify the salience of crucial information in each modal feature. Depthwise separable convolution markedly reduces the number of training parameters and further enhances the feature correlation. Experimental evaluations were conducted on four multimodal databases, comprising six unimodal databases, including multispectral palmprint and palm vein databases from the Chinese Academy of Sciences. The Equal Error Rate (EER) values were 0.0149%, 0.0150%, 0.0099%, and 0.0050%, respectively. Compared with other network methods for palmprint, palm vein, and finger vein fusion recognition, this approach substantially improves recognition performance, making it suitable for high-security environments with practical applicability. The experiments in this article used a modest sample database comprising 200 individuals. The next phase involves preparing to extend the method to larger databases.
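For readers unfamiliar with the Equal Error Rate reported above, the following minimal Python sketch shows one common way to estimate it: sweep a decision threshold over the match scores and take the point where the false acceptance rate (FAR) and false rejection rate (FRR) are closest. The function name `equal_error_rate` and the toy scores are hypothetical; the abstract does not state how the authors computed their EER values, so this is only an assumed, simplified procedure.

```python
from typing import Sequence


def equal_error_rate(genuine: Sequence[float], impostor: Sequence[float]) -> float:
    """Estimate the EER from match scores, where a higher score means a better match.

    A sample is accepted when its score is >= threshold.
    FAR = fraction of impostor scores accepted; FRR = fraction of genuine scores rejected.
    """
    best_gap = None
    best_eer = 0.0
    for t in sorted(set(genuine) | set(impostor)):
        far = sum(s >= t for s in impostor) / len(impostor)
        frr = sum(s < t for s in genuine) / len(genuine)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            # Keep the threshold where FAR and FRR are closest to each other.
            best_gap = gap
            best_eer = (far + frr) / 2
    return best_eer


# Hypothetical toy scores, not data from the paper.
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.1, 0.2, 0.3, 0.5]
eer = equal_error_rate(genuine_scores, impostor_scores)
```

With real data the score sets are far larger, and FAR and FRR are usually interpolated rather than matched at a single observed threshold.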
Funding: Supported by the Hebei Provincial Natural Science Foundation of China (Grant Nos. E2020203174, E2020203078); the S&T Program of Hebei Province of China (Grant No. 226Z2202G); the Science Research Project of Hebei Provincial Education Department of China (Grant No. ZD2022029).
Abstract: The all-wheel drive (AWD) hybrid system is a research focus in high-performance new energy vehicles that can meet the demands of dynamic performance and passing ability. Simultaneous optimization of the power and economy of hybrid vehicles has become an issue. A unique multi-mode coupling (MMC) AWD hybrid system is presented that realizes distributed and centralized driving of the front and rear axles, achieving vectored distribution and full use of the system power between the axles. Based on the parameters of the benchmarking model of a hybrid vehicle, a model-predictive-control-based energy management strategy is proposed. First, the drive system model was built after analyzing the MMC-AWD's drive modes. Next, three fundamental strategies were established to address power distribution adjustment and battery SOC maintenance when the SOC changed, followed by the design of a road driving force observer. Then, the energy consumption rate in the average time domain was processed before designing the minimum fuel consumption controller based on the equivalent fuel consumption coefficient. Finally, the advantage of the MMC-AWD was confirmed by comparing its dynamic performance and economy with those of the BYD Song PLUS DMI-AWD. The findings indicate that, relative to the comparison hybrid system, the MMC-AWD's acceleration capacity increases by 5.26% and 7.92% at road adhesion coefficients of 0.8 and 0.6, respectively. At road adhesion coefficients of 0.8, 0.6, and 0.4, the maximum climbing ability increases by 14.22%, 12.88%, and 4.55%, respectively. As a result, the dynamic performance is greatly enhanced, and the fuel savings rate per 100 km reaches 12.06%, which is also very economical. The proposed control strategies for the new hybrid AWD vehicle can optimize power and economy simultaneously.
Funding: Project (2023JH26-10100002) supported by the Liaoning Science and Technology Major Project, China; Projects (U21A20117, 52074085) supported by the National Natural Science Foundation of China; Project (2022JH2/101300008) supported by the Liaoning Applied Basic Research Program Project, China; Project (22567612H) supported by the Hebei Provincial Key Laboratory Performance Subsidy Project, China.
Abstract: Mill vibration is a common problem in rolling production. It directly affects the thickness accuracy of the strip and, in serious cases, may even lead to strip fracture accidents. Existing vibration prediction models do not consider the features contained in the data, which limits improvements in model accuracy. To address these challenges, this paper proposes a multi-dimensional multi-modal cold rolling vibration time series prediction model (MDMMVPM) based on the deep fusion of multi-level networks. The model considers the long-term and short-term modal features of multi-dimensional data and selects appropriate prediction algorithms for different data features. Based on the established prediction model, the effects of tension and rolling force on mill vibration are analyzed. Taking the 5th stand of a cold mill in a steel mill as the research object, the model is applied to predict mill vibration for the first time. The experimental results show that the correlation coefficient (R²) of the proposed model is 92.5% and the root-mean-square error (RMSE) is 0.0011, a significant improvement in modeling accuracy over existing models. The proposed model is also suitable for the hot rolling process, which provides a new method for predicting strip rolling vibration.
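The two accuracy figures quoted in this abstract can be computed as in the minimal Python sketch below. It assumes that R² denotes the usual coefficient of determination, 1 − SS_res / SS_tot, which the abstract does not state explicitly. The function names and the toy series are hypothetical and are not data from the paper.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root-mean-square error between measured and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy vibration series, for illustration only.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

error = rmse(measured, predicted)
fit = r_squared(measured, predicted)
```

RMSE carries the units of the predicted quantity, so the reported 0.0011 is only meaningful relative to the scale of the vibration signal, whereas R² is dimensionless.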
Abstract: To address the difficulties in fusing multi-mode sensor data for complex industrial machinery, an adaptive deep coupling convolutional auto-encoder (ADCCAE) fusion method was proposed. First, the multi-mode features extracted synchronously by the CCAE were stacked and fed to the multi-channel convolution layers for fusion. Then, the fused data were passed to fully connected layers for compression and fed to the Softmax module for classification. Finally, the coupling loss function coefficients and the network parameters were optimized adaptively using the gray wolf optimization (GWO) algorithm. Experimental comparisons showed that the proposed ADCCAE fusion model was superior to existing models for multi-mode data fusion.
Abstract: Multimodal Sentiment Analysis (SA) is gaining popularity because of its broad application potential. Existing studies have focused on the SA of single modalities, such as text or photos, which makes it difficult to handle social media data with multiple modalities effectively. Moreover, most multimodal research has concentrated on merely combining the two modalities rather than exploring their complex correlations, leading to unsatisfactory sentiment classification results. Motivated by this, we propose a new visual-textual sentiment classification model named Multi-Model Fusion (MMF), which uses a mixed fusion framework for SA to capture the essential information and the intrinsic relationship between visual and textual content. The proposed model comprises three deep neural networks. Two different neural networks are proposed to extract the most emotionally relevant aspects of image and text data, so that more discriminative features are gathered for accurate sentiment classification. Then, a multichannel joint fusion model with a self-attention technique is proposed to exploit the intrinsic correlation between visual and textual characteristics and obtain emotionally rich information for joint sentiment classification. Finally, the results of the three classifiers are integrated using a decision fusion scheme to improve the robustness and generalizability of the proposed model. An interpretable visual-textual sentiment classification model is further developed using the Local Interpretable Model-agnostic Explanation model (LIME) to ensure the model's explainability and resilience. The proposed MMF model has been tested on four real-world sentiment datasets, achieving 99.78% accuracy on Binary_Getty (BG), 99.12% on Binary_iStock (BIS), 95.70% on Twitter, and 79.06% on the Multi-View Sentiment Analysis (MVSA) dataset. These results demonstrate the superior performance of our MMF model compared to single-model approaches and current state-of-the-art techniques based on model evaluation criteria.
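The decision fusion step described above can be sketched as follows. The abstract does not say which decision fusion scheme MMF uses, so this minimal Python example assumes a simple weighted average of class probabilities (soft voting) over the three classifiers. The function name `decision_fusion` and the probability values are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def decision_fusion(
    prob_lists: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> Tuple[int, List[float]]:
    """Fuse per-classifier class probabilities by weighted averaging (assumed scheme).

    Returns the index of the winning class and the fused probability vector.
    """
    n_classifiers = len(prob_lists)
    n_classes = len(prob_lists[0])
    if weights is None:
        # Assumption: every classifier gets the same weight.
        weights = [1.0 / n_classifiers] * n_classifiers
    fused = [
        sum(w * probs[c] for w, probs in zip(weights, prob_lists))
        for c in range(n_classes)
    ]
    label = max(range(n_classes), key=lambda c: fused[c])
    return label, fused


# Hypothetical outputs for classes [negative, positive].
image_probs = [0.6, 0.4]   # image-only classifier
text_probs = [0.3, 0.7]    # text-only classifier
joint_probs = [0.2, 0.8]   # joint visual-textual classifier

label, fused = decision_fusion([image_probs, text_probs, joint_probs])
```

In this toy case the image classifier alone would predict the negative class, but the fused decision follows the other two classifiers, which is the kind of robustness gain that decision-level fusion aims for.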
Funding: funded by the National Natural Science Foundation of China, grant number 61302188.
Abstract: Multimodal medical image fusion can help physicians provide more accurate treatment plans for patients, as unimodal images provide limited valid information. To address the insufficient ability of traditional medical image fusion solutions to protect image details and significant information, a new multimodality medical image fusion method (NSST-PAPCNNLatLRR) is proposed in this paper. First, the high- and low-frequency sub-band coefficients are obtained by decomposing the source image using NSST. Then, the latent low-rank representation algorithm is used to process the low-frequency sub-band coefficients, and an improved PAPCNN algorithm is proposed for the fusion of high-frequency sub-band coefficients. The improved PAPCNN model is based on automatic parameter setting, and the optimal method was configured for the time decay factor αe. The experimental results show that, compared with five mainstream fusion algorithms, the new algorithm significantly improves the visual effect, enhances the ability to characterize important information in images, and further improves the protection of detailed information; it ranks first on at least four of six objective indexes.
Funding: This work was supported by the National Natural Science Foundation of China (No. 62172308, No. U1626107, No. 61972297, No. 62172144, and No. 62062019).
Abstract: PowerShell has been widely deployed in fileless malware and advanced persistent threat (APT) attacks because of its high stealthiness and living-off-the-land technique. However, existing works mainly focus on deobfuscation and malicious detection and lack classification of malicious PowerShell families and behavior analysis. Moreover, state-of-the-art methods fail to capture fine-grained features and semantic relationships, resulting in low robustness and accuracy. To this end, we propose PowerDetector, a novel malicious PowerShell script detector based on multimodal semantic fusion and deep learning. Specifically, we design four feature extraction methods to extract key features from the character, token, abstract syntax tree (AST), and semantic knowledge graph views. Then, we design four embeddings (i.e., Char2Vec, Token2Vec, AST2Vec, and Rela2Vec) and construct a multi-modal fusion algorithm to concatenate feature vectors from the different views. Finally, we propose a combined model based on a transformer and CNN-BiLSTM to implement PowerShell family detection. Our experiments with five types of PowerShell attacks show that PowerDetector can accurately detect various obfuscated and stealthy PowerShell scripts, with a precision of 0.9402, a recall of 0.9358, and an F1-score of 0.9374. Furthermore, through single-modal and multi-modal comparison experiments, we demonstrate that PowerDetector's multi-modal embedding and deep learning model achieve better accuracy and can even identify more unknown attacks.
Abstract: In geometry processing, symmetry research benefits from the global geometric features of complete shapes, but the shape of an object captured in real-world applications is often incomplete because of limited sensor resolution, a single viewpoint, and occlusion. Unlike existing works that predict symmetry from the complete shape, we propose a learning approach for symmetry prediction based on a single RGB-D image. Instead of directly predicting the symmetry from incomplete shapes, our method consists of two modules: the multi-modal feature fusion module and the detection-by-reconstruction module. First, we build a channel-transformer network (CTN) to extract cross-fusion features from the RGB-D image as the multi-modal feature fusion module, which helps us aggregate features from the color and the depth separately. Then, our self-reconstruction network based on a 3D variational auto-encoder (3D-VAE) takes the global geometric features as input, followed by a symmetry prediction network that detects the symmetry. Our experiments are conducted on three public datasets: ShapeNet, YCB, and ScanNet. We demonstrate that our method can produce reliable and accurate results.
Abstract: To address missed and false detections of targets on the road in autonomous driving, a 3D object detection model for autonomous driving based on an improved Centerfusion is proposed. The model fuses camera information with radar features to form a multi-channel feature input, which strengthens the robustness of the detection network and reduces missed detections. To obtain more accurate and richer 3D detection information, an improved attention mechanism is introduced to strengthen the fusion of radar point clouds and visual information in the frustum grid. An improved loss function is used to optimize the accuracy of bounding-box prediction. The model is validated and compared on the Nuscenes dataset. The experimental results show that, compared with the traditional Centerfusion model, the proposed model improves mean Average Precision (mAP) by 1.3% and the Nuscenes Detection Score (NDS) by 1.2%.
Funding: supported by the National Natural Science Foundation of China (Grant No. 81973695); Discipline with Strong Characteristics of Liaocheng University-Intelligent Science and Technology (Grant No. 319462208).
Abstract: Event extraction is a significant task in information extraction that aims to automatically extract structured event information from large volumes of unstructured text. Extracting event elements from multi-modal data remains challenging because the data contain a large number of images and overlapping event elements. Although researchers have proposed various methods for this task, most existing event extraction models cannot address these challenges because they apply only to text scenarios. To solve these issues, this paper proposes a multi-modal event extraction method based on knowledge fusion. Specifically, for event-type recognition, we use a pipeline approach that integrates multiple pre-trained models. This approach captures the multidimensional event semantic features in military texts more comprehensively, thereby strengthening the connection between trigger words and events. For event element extraction, we propose a method for constructing a priori templates that combine event types with their corresponding trigger words. This approach yields fine-grained input samples containing event trigger words, enabling the model to understand the semantic relationships between elements in greater depth. Furthermore, a fusion method for the spatial mapping of textual event elements and image elements is proposed to reduce the overload in the number of categories and achieve multi-modal knowledge fusion effectively. Experimental results on the CCKS 2022 dataset show that our method achieves competitive results, with an overall F1-score of 53.4% for the model. These results validate the effectiveness of our method in extracting event elements from multi-modal data.
Funding: supported by the following funds: National Natural Science Foundation of China (51935014, 52165043); Jiangxi Provincial Cultivation Program for Academic and Technical Leaders of Major Subjects (20225BCJ23008); Jiangxi Provincial Natural Science Foundation (20224ACB204013, 20224ACB214008); Scientific Research Project of Anhui Universities (KJ2021A1106).
Abstract: Magnesium (Mg) alloys are considered a new generation of revolutionary medical metals. Laser-beam powder bed fusion (PBF-LB) is suitable for fabricating metal implants with personalized and complicated structures. However, the as-built part usually exhibits an undesirable microstructure and unsatisfactory performance. In this work, WE43 parts were first fabricated by PBF-LB and then subjected to heat treatment. Although a high densification rate of 99.91% was achieved using suitable processes, the as-built parts exhibited an anisotropic and layered microstructure with heterogeneously precipitated Nd-rich intermetallic. After heat treatment, fine, nano-scaled Mg24Y5 particles were precipitated. Meanwhile, the α-Mg grains underwent recrystallization and coarsened slightly, which effectively weakened the texture intensity and reduced the anisotropy. As a consequence, the yield strength and ultimate tensile strength were significantly improved to (250.2 ± 3.5) MPa and (312 ± 3.7) MPa, respectively, while the elongation was maintained at a high level of 15.2%. Furthermore, the homogenized microstructure reduced the tendency toward localized corrosion and favored the development of a uniform passivation film. Thus, the degradation rate of WE43 parts decreased by an order of magnitude. In addition, in-vitro cell experiments demonstrated their favorable biocompatibility.
Funding: National Key Research and Development Program of China (Grant No. 2020YFB2009702); National Natural Science Foundation of China (Grant Nos. 52075055, U21A20124 and 52111530069); Chongqing Natural Science Foundation of China (Grant No. cstc2020jcyj-msxmX0780).
Abstract: In mobile machinery, hydro-mechanical pumps are increasingly being replaced by electronically controlled pumps to raise the level of automation. However, diversified control functions (e.g., power limitation and pressure cut-off) are integrated into the electronic controller only at the pump level, which can lead to instability of the overall system. To solve this problem, a multi-mode electrohydraulic load sensing (MELS) control scheme is proposed that specifically considers switching stability at the system level. It includes four working modes: flow control, load sensing, power limitation, and pressure control. Depending on the actual working requirements, the switching rules for the different modes and the switching direction (i.e., whether modes can be switched bilaterally or unilaterally) are defined. The priority of the modes is also defined, from high to low: pressure control, power limitation, load sensing, and flow control. When multiple switching rules are satisfied at the same time, the system switches to the control mode with the highest priority. In addition, the switching stability between the flow control and pressure control modes is analyzed, and the controller parameters that guarantee switching stability are obtained. A comparative study is carried out on a test rig with a 2-ton hydraulic excavator. The results show that the MELS controller achieves the control functions of proper flow supply, power limitation, and pressure cut-off, with good stability when switching between control modes. This research proposes the MELS control method, which realizes stable multi-mode switching in the hydraulic system of mobile machinery under different working conditions.
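The priority rule stated in this abstract (pressure control over power limitation over load sensing over flow control, with the highest-priority satisfied rule winning) can be sketched in Python as below. The enum `Mode` and the function `select_mode` are hypothetical names. The sketch covers only priority arbitration and deliberately omits the switching conditions themselves and the bilateral or unilateral switching directions, which the abstract does not specify.

```python
from enum import IntEnum
from typing import Iterable


class Mode(IntEnum):
    """Working modes; a larger value means a higher priority (ordering from the abstract)."""
    FLOW_CONTROL = 0
    LOAD_SENSING = 1
    POWER_LIMITATION = 2
    PRESSURE_CONTROL = 3


def select_mode(satisfied: Iterable[Mode], current: Mode) -> Mode:
    """Pick the highest-priority mode whose switching rule is satisfied.

    Assumption: if no switching rule is satisfied, the controller stays in `current`.
    """
    candidates = list(satisfied)
    return max(candidates) if candidates else current


# Load-sensing and pressure-control rules fire together: pressure control wins.
next_mode = select_mode({Mode.LOAD_SENSING, Mode.PRESSURE_CONTROL}, Mode.FLOW_CONTROL)
```

Encoding the priority as the integer value of an `IntEnum` keeps the arbitration to a single `max` call and makes the ordering explicit in one place.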
Funding: funded by the National Key Research and Development Program of China (2018YFE0104200); National Natural Science Foundation of China (51875310, 52175274, 82172065); Tsinghua Precision Medicine Foundation.
Abstract: Laser powder bed fusion (L-PBF) of Mg alloys has provided tremendous opportunities for customized production of aeronautical and medical parts. Layer thickness (LT) is of great significance to the L-PBF process but has not been studied for Mg alloys. In this study, WE43 Mg alloy bulk cubes, porous scaffolds, and thin walls with layer thicknesses of 10, 20, 30, and 40 μm were fabricated. The required laser energy input increased with increasing layer thickness and differed between the bulk cubes and the porous scaffolds. Porosity tended to occur at the connection joints in porous scaffolds for LT40 and could be eliminated by reducing the laser energy input. For thin-wall parts, a large overhang angle or a small wall thickness resulted in porosity when a large layer thickness was used, and the porosity disappeared when the layer thickness or laser energy input was reduced. A deeper keyhole penetration was found in all cases with porosity, which explains the influence of layer thickness, geometrical structure, and laser energy input on the porosity. All the samples achieved a high fusion quality, with a relative density of over 99.5%, using the optimized laser energy input. Increased layer thickness resulted in more precipitation phases, finer grain sizes, and decreased grain texture. With similarly high fusion quality, the tensile strength and elongation of bulk samples improved significantly from 257 MPa and 1.41% with the 10 μm layer to 287 MPa and 15.12% with the 40 μm layer, in accordance with the microstructural change. The effect of layer thickness on the compressive properties of porous scaffolds was limited. However, the corrosion rate of bulk samples accelerated with increasing layer thickness, mainly because of the increased number of precipitation phases.
Funding: supported by the National Natural Science Foundation of China (Grant No.: 81903643); the "Young Talent Support Plan" of Xi'an Jiaotong University; the Shaanxi Province Science and Technology Development Plan Project (Grant No.: 2022ZDLSF05-05); the Project of Shaanxi Provincial Administration of Traditional Chinese Medicine (Project No.: 2021-03-ZZ-002); the Shaanxi Province Science Fund for Distinguished Young Scholars (Grant No.: 2023-JC-JQ-59).
Abstract: Hepatocellular carcinoma (HCC) is one of the most common tumor types and remains a major clinical challenge. Increasing evidence has revealed that mitophagy inhibitors can enhance the effect of chemotherapy on HCC. However, few mitophagy inhibitors have been approved for clinical use in humans. Pyrimethamine (Pyr) is used to treat infections caused by protozoan parasites. Recent studies have reported that Pyr may be beneficial in the treatment of various tumors. However, its mechanism of action is still not clearly defined. Here, we found that blocking mitophagy sensitized cells to Pyr-induced apoptosis. Mechanistically, Pyr potently induced the accumulation of autophagosomes by inhibiting autophagosome-lysosome fusion in human HCC cells. In vitro and in vivo studies revealed that Pyr blocked autophagosome-lysosome fusion by upregulating BNIP3 to inhibit synaptosomal-associated protein 29 (SNAP29)-vesicle-associated membrane protein 8 (VAMP8) interaction. Moreover, Pyr acted synergistically with sorafenib (Sora) to induce apoptosis and inhibit HCC proliferation in vitro and in vivo. Pyr enhances the sensitivity of HCC cells to Sora, a common chemotherapeutic, by inhibiting mitophagy. Thus, these results provide new insights into the mechanism of action of Pyr and imply that Pyr could potentially be further developed as a novel mitophagy inhibitor. Notably, Pyr and Sora combination therapy could be a promising treatment for malignant HCC.
Funding: financially supported by the National Key Research and Development Program of China (2022YFB4600302); National Natural Science Foundation of China (52090041); National Natural Science Foundation of China (52104368); National Major Science and Technology Projects of China (J2019-VII-0010-0150).
Abstract: Metal additive manufacturing (AM) has been extensively studied in recent decades. Despite significant progress in manufacturing complex shapes and structures, challenges such as severe cracking when using existing alloys for laser powder bed fusion (L-PBF) AM have persisted. These challenges arise because commercial alloys are primarily designed for conventional casting or forging processes and do not account for the fast cooling rates, steep temperature gradients, and multiple thermal cycles of L-PBF. To address this, there is an urgent need to develop novel alloys specifically tailored for L-PBF technologies. This review provides a comprehensive summary of the strategies employed in alloy design for L-PBF. It aims to guide future research on designing novel alloys dedicated to L-PBF instead of adapting existing alloys. The review begins by discussing the features of L-PBF processes, focusing on rapid solidification and intrinsic heat treatment. Next, the printability of the four main existing alloy families (Fe-, Ni-, Al- and Ti-based alloys) is critically assessed and compared with their conventional weldability. The weldability criteria were found not to be always applicable in estimating printability. Furthermore, the review presents recent advances in alloy development and the associated strategies, categorizing them into crack-mitigation-oriented, microstructure-manipulation-oriented, and machine-learning-assisted approaches. Lastly, an outlook and suggestions are given to highlight the issues that need to be addressed in future work.
Abstract: A novel image fusion network framework with an autonomous encoder and decoder is proposed to improve the visual impression of fused images by improving the quality of infrared and visible light image fusion. The network comprises an encoder module, a fusion layer, a decoder module, and an edge enhancement module. The encoder module uses an enhanced Inception module for shallow feature extraction, then combines Res2Net and Transformer to achieve deep-level co-extraction of local and global features from the original image. An edge enhancement module (EEM) is created to extract significant edge features. A modal maximum difference fusion strategy is introduced to enhance the adaptive representation of information in various regions of the source image, thereby enhancing the contrast of the fused image. The encoder and the EEM extract features, which are then combined in the fusion layer, and the decoder creates the fused image from them. Three datasets were chosen to test the algorithm proposed in this paper. The experimental results demonstrate that the network effectively preserves background and detail information from both infrared and visible images, yielding superior outcomes in subjective and objective evaluations.