Based on the CNN-LSTM fusion deep neural network,this paper proposes a seismic velocity model building method that can simultaneously estimate the root mean square(RMS)velocity and interval velocity from the common-mi...Based on the CNN-LSTM fusion deep neural network,this paper proposes a seismic velocity model building method that can simultaneously estimate the root mean square(RMS)velocity and interval velocity from the common-midpoint(CMP)gather.In the proposed method,a convolutional neural network(CNN)Encoder and two long short-term memory networks(LSTMs)are used to extract spatial and temporal features from seismic signals,respectively,and a CNN Decoder is used to recover RMS velocity and interval velocity of underground media from various feature vectors.To address the problems of unstable gradients and easily fall into a local minimum in the deep neural network training process,we propose to use Kaiming normal initialization with zero negative slopes of rectifi ed units and to adjust the network learning process by optimizing the mean square error(MSE)loss function with the introduction of a freezing factor.The experiments on testing dataset show that CNN-LSTM fusion deep neural network can predict RMS velocity as well as interval velocity more accurately,and its inversion accuracy is superior to that of single neural network models.The predictions on the complex structures and Marmousi model are consistent with the true velocity variation trends,and the predictions on fi eld data can eff ectively correct the phase axis,improve the lateral continuity of phase axis and quality of stack section,indicating the eff ectiveness and decent generalization capability of the proposed method.展开更多
The employment of deep convolutional neural networks has recently contributed to significant progress in single image super-resolution(SISR)research.However,the high computational demands of most SR techniques hinder ...The employment of deep convolutional neural networks has recently contributed to significant progress in single image super-resolution(SISR)research.However,the high computational demands of most SR techniques hinder their applicability to edge devices,despite their satisfactory reconstruction performance.These methods commonly use standard convolutions,which increase the convolutional operation cost of the model.In this paper,a lightweight Partial Separation and Multiscale Fusion Network(PSMFNet)is proposed to alleviate this problem.Specifically,this paper introduces partial convolution(PConv),which reduces the redundant convolution operations throughout the model by separating some of the features of an image while retaining features useful for image reconstruction.Additionally,it is worth noting that the existing methods have not fully utilized the rich feature information,leading to information loss,which reduces the ability to learn feature representations.Inspired by self-attention,this paper develops a multiscale feature fusion block(MFFB),which can better utilize the non-local features of an image.MFFB can learn long-range dependencies from the spatial dimension and extract features from the channel dimension,thereby obtaining more comprehensive and rich feature information.As the role of the MFFB is to capture rich global features,this paper further introduces an efficient inverted residual block(EIRB)to supplement the local feature extraction ability of PSMFNet.A comprehensive analysis of the experimental results shows that PSMFNet maintains a better performance with fewer parameters than the state-of-the-art models.展开更多
Social media has become increasingly significant in modern society,but it has also turned into a breeding ground for the propagation of misleading information,potentially causing a detrimental impact on public opinion...Social media has become increasingly significant in modern society,but it has also turned into a breeding ground for the propagation of misleading information,potentially causing a detrimental impact on public opinion and daily life.Compared to pure text content,multmodal content significantly increases the visibility and share ability of posts.This has made the search for efficient modality representations and cross-modal information interaction methods a key focus in the field of multimodal fake news detection.To effectively address the critical challenge of accurately detecting fake news on social media,this paper proposes a fake news detection model based on crossmodal message aggregation and a gated fusion network(MAGF).MAGF first uses BERT to extract cumulative textual feature representations and word-level features,applies Faster Region-based ConvolutionalNeuralNetwork(Faster R-CNN)to obtain image objects,and leverages ResNet-50 and Visual Geometry Group-19(VGG-19)to obtain image region features and global features.The image region features and word-level text features are then projected into a low-dimensional space to calculate a text-image affinity matrix for cross-modal message aggregation.The gated fusion network combines text and image region features to obtain adaptively aggregated features.The interaction matrix is derived through an attention mechanism and further integrated with global image features using a co-attention mechanism to producemultimodal representations.Finally,these fused features are fed into a classifier for news categorization.Experiments were conducted on two public datasets,Twitter and Weibo.Results show that the proposed model achieves accuracy rates of 91.8%and 88.7%on the two datasets,respectively,significantly outperforming traditional unimodal and existing multimodal models.展开更多
Automatic segmentation of medical images provides a reliable scientific basis for disease diagnosis and analysis.Notably,most existing methods that combine the strengths of convolutional neural networks(CNNs)and Trans...Automatic segmentation of medical images provides a reliable scientific basis for disease diagnosis and analysis.Notably,most existing methods that combine the strengths of convolutional neural networks(CNNs)and Transformers have made significant progress.However,there are some limitations in the current integration of CNN and Transformer technology in two key aspects.Firstly,most methods either overlook or fail to fully incorporate the complementary nature between local and global features.Secondly,the significance of integrating the multiscale encoder features from the dual-branch network to enhance the decoding features is often disregarded in methods that combine CNN and Transformer.To address this issue,we present a groundbreaking dual-branch cross-attention fusion network(DCFNet),which efficiently combines the power of Swin Transformer and CNN to generate complementary global and local features.We then designed the Feature Cross-Fusion(FCF)module to efficiently fuse local and global features.In the FCF,the utilization of the Channel-wise Cross-fusion Transformer(CCT)serves the purpose of aggregatingmulti-scale features,and the Feature FusionModule(FFM)is employed to effectively aggregate dual-branch prominent feature regions from the spatial perspective.Furthermore,within the decoding phase of the dual-branch network,our proposed Channel Attention Block(CAB)aims to emphasize the significance of the channel features between the up-sampled features and the features generated by the FCFmodule to enhance the details of the decoding.Experimental results demonstrate that DCFNet exhibits enhanced accuracy in segmentation performance.Compared to other state-of-the-art(SOTA)methods,our segmentation framework exhibits a superior level of competitiveness.DCFNet’s accurate segmentation of medical images can greatly assist medical professionals in making crucial diagnoses of lesion areas in advance.展开更多
Convolutional neural networks (CNNs) are widely used in image classification tasks, but their increasing model size and computation make them challenging to implement on embedded systems with constrained hardware reso...Convolutional neural networks (CNNs) are widely used in image classification tasks, but their increasing model size and computation make them challenging to implement on embedded systems with constrained hardware resources. To address this issue, the MobileNetV1 network was developed, which employs depthwise convolution to reduce network complexity. MobileNetV1 employs a stride of 2 in several convolutional layers to decrease the spatial resolution of feature maps, thereby lowering computational costs. However, this stride setting can lead to a loss of spatial information, particularly affecting the detection and representation of smaller objects or finer details in images. To maintain the trade-off between complexity and model performance, a lightweight convolutional neural network with hierarchical multi-scale feature fusion based on the MobileNetV1 network is proposed. The network consists of two main subnetworks. The first subnetwork uses a depthwise dilated separable convolution (DDSC) layer to learn imaging features with fewer parameters, which results in a lightweight and computationally inexpensive network. Furthermore, depthwise dilated convolution in DDSC layer effectively expands the field of view of filters, allowing them to incorporate a larger context. The second subnetwork is a hierarchical multi-scale feature fusion (HMFF) module that uses parallel multi-resolution branches architecture to process the input feature map in order to extract the multi-scale feature information of the input image. Experimental results on the CIFAR-10, Malaria, and KvasirV1 datasets demonstrate that the proposed method is efficient, reducing the network parameters and computational cost by 65.02% and 39.78%, respectively, while maintaining the network performance compared to the MobileNetV1 baseline.展开更多
The purpose of this study was to address the challenges in predicting and classifying accuracy in modeling Container Dwell Time (CDT) using Artificial Neural Networks (ANN). This objective was driven by the suboptimal...The purpose of this study was to address the challenges in predicting and classifying accuracy in modeling Container Dwell Time (CDT) using Artificial Neural Networks (ANN). This objective was driven by the suboptimal outcomes reported in previous studies and sought to apply an innovative approach to improve these results. To achieve this, the study applied the Fusion of Activation Functions (FAFs) to a substantial dataset. This dataset included 307,594 container records from the Port of Tema from 2014 to 2022, encompassing both import and transit containers. The RandomizedSearchCV algorithm from Python’s Scikit-learn library was utilized in the methodological approach to yield the optimal activation function for prediction accuracy. The results indicated that “ajaLT”, a fusion of the Logistic and Hyperbolic Tangent Activation Functions, provided the best prediction accuracy, reaching a high of 82%. Despite these encouraging findings, it’s crucial to recognize the study’s limitations. While Fusion of Activation Functions is a promising method, further evaluation is necessary across different container types and port operations to ascertain the broader applicability and generalizability of these findings. The original value of this study lies in its innovative application of FAFs to CDT. Unlike previous studies, this research evaluates the method based on prediction accuracy rather than training time. It opens new avenues for machine learning engineers and researchers in applying FAFs to enhance prediction accuracy in CDT modeling, contributing to a previously underexplored area.展开更多
A constructive-pruning hybrid method (CPHM) for radial basis function (RBF) networks is proposed to improve the prediction accuracy of ash fusion temperatures (AFT). The CPHM incorporates the advantages of the c...A constructive-pruning hybrid method (CPHM) for radial basis function (RBF) networks is proposed to improve the prediction accuracy of ash fusion temperatures (AFT). The CPHM incorporates the advantages of the construction algorithm and the pruning algorithm of neural networks, and the training process of the CPHM is divided into two stages: rough tuning and fine tuning. In rough tuning, new hidden units are added to the current network until some performance index is satisfied. In fine tuning, the network structure and the model parameters are further adjusted. And, based on components of coal ash, a model using the CPHM is established to predict the AFT. The results show that the CPHM prediction model is characterized by its high precision, compact network structure, as well as strong generalization ability and robustness.展开更多
The spaceborne precipitation radar onboard the Tropical Rainfall Measuring Mission satellite (TRMM PR) can provide good measurement of the vertical structure of reflectivity, while ground radar (GR) has a relative...The spaceborne precipitation radar onboard the Tropical Rainfall Measuring Mission satellite (TRMM PR) can provide good measurement of the vertical structure of reflectivity, while ground radar (GR) has a relatively high horizontal resolution and greater sensitivity. Fusion of TRMM PR and GR reflectivity data may maximize the advantages from both instruments. In this paper, TRMM PR and GR reflectivity data are fused using a neural network (NN)-based approach. The main steps included are: quality control of TRMM PR and GR reflectivity data; spatiotemporal matchup; GR calibration bias correction; conversion of TRMM PR data from Ku to S band; fusion of TRMM PR and GR reflectivity data with an NN method: interpolation of reflectivity data that are below PR's sensitivity; blind areas compensation with a distance weighting-based merging approach; combination of three types of data: data with the NN method, data below PR's sensitivity and data within compensated blind areas. During the NN fusion step, the TRMM PR data are taken as targets of the training NNs, and gridded GR data after horizontal downsampling at different heights are used as the input. The trained NNs are then used to obtain 3D high-resolution reflectivity from the original GR gridded data. After 3D fusion of the TRMM PR and GR reflectivity data, a more complete and finer-scale 3D radar reflectivity dataset incorporating characteristics from both the TRMM PR and GR observations can be obtained. The fused reflectivity data are evaluated based on a convective precipitation event through comparison with the high resolution TRMM PR and GR data with an interpolation algorithm.展开更多
In wireless sensor networks, target classification differs from that in centralized sensing systems because of the distributed detection, wireless communication and limited resources. We study the classification probl...In wireless sensor networks, target classification differs from that in centralized sensing systems because of the distributed detection, wireless communication and limited resources. We study the classification problem of moving vehicles in wireless sensor networks using acoustic signals emitted from vehicles. Three algorithms including wavelet decomposition, weighted k-nearest-neighbor and Dempster-Shafer theory are combined in this paper. Finally, we use real world experimental data to validate the classification methods. The result shows that wavelet based feature extraction method can extract stable features from acoustic signals. By fusion with Dempster's rule, the classification performance is improved.展开更多
In practical multi-sensor information fusion systems, there exists uncertainty about the network structure, active state of sensors, and information itself (including fuzziness, randomness, incompleteness as well as ...In practical multi-sensor information fusion systems, there exists uncertainty about the network structure, active state of sensors, and information itself (including fuzziness, randomness, incompleteness as well as roughness, etc). Hence it requires investigating the problem of uncertain information fusion. Robust learning algorithm which adapts to complex environment and the fuzzy inference algorithm which disposes fuzzy information are explored to solve the problem. Based on the fusion technology of neural networks and fuzzy inference algorithm, a multi-sensor uncertain information fusion system is modeled. Also RANFIS learning algorithm and fusing weight synthesized inference algorithm are developed from the ANFIS algorithm according to the concept of robust neural networks. This fusion system mainly consists of RANFIS confidence estimator, fusing weight synthesized inference knowledge base and weighted fusion section. The simulation result demonstrates that the proposed fusion model and algorithm have the capability of uncertain information fusion, thus is obviously advantageous compared with the conventional Kalman weighted fusion algorithm.展开更多
Scene recognition is a popular open problem in the computer vision field.Among lots of methods proposed in recent years,Convolutional Neural Network(CNN)based approaches achieve the best performance in scene recogniti...Scene recognition is a popular open problem in the computer vision field.Among lots of methods proposed in recent years,Convolutional Neural Network(CNN)based approaches achieve the best performance in scene recognition.We propose in this paper an advanced feature fusion algorithm using Multiple Convolutional Neural Network(Multi-CNN)for scene recognition.Unlike existing works that usually use individual convolutional neural network,a fusion of multiple different convolutional neural networks is applied for scene recognition.Firstly,we split training images in two directions and apply to three deep CNN model,and then extract features from the last full-connected(FC)layer and probabilistic layer on each model.Finally,feature vectors are fused with different fusion strategies in groups forwarded into SoftMax classifier.Our proposed algorithm is evaluated on three scene datasets for scene recognition.The experimental results demonstrate the effectiveness of proposed algorithm compared with other state-of-art approaches.展开更多
In order to improve the detection accuracy of small objects,a neighborhood fusion-based hierarchical parallel feature pyramid network(NFPN)is proposed.Unlike the layer-by-layer structure adopted in the feature pyramid...In order to improve the detection accuracy of small objects,a neighborhood fusion-based hierarchical parallel feature pyramid network(NFPN)is proposed.Unlike the layer-by-layer structure adopted in the feature pyramid network(FPN)and deconvolutional single shot detector(DSSD),where the bottom layer of the feature pyramid network relies on the top layer,NFPN builds the feature pyramid network with no connections between the upper and lower layers.That is,it only fuses shallow features on similar scales.NFPN is highly portable and can be embedded in many models to further boost performance.Extensive experiments on PASCAL VOC 2007,2012,and COCO datasets demonstrate that the NFPN-based SSD without intricate tricks can exceed the DSSD model in terms of detection accuracy and inference speed,especially for small objects,e.g.,4%to 5%higher mAP(mean average precision)than SSD,and 2%to 3%higher mAP than DSSD.On VOC 2007 test set,the NFPN-based SSD with 300×300 input reaches 79.4%mAP at 34.6 frame/s,and the mAP can raise to 82.9%after using the multi-scale testing strategy.展开更多
Obtaining comprehensive and accurate information is very important in intelligent tragic system (ITS). In ITS, the GPS floating car system is an important approach for traffic data acquisition. However, in this syst...Obtaining comprehensive and accurate information is very important in intelligent tragic system (ITS). In ITS, the GPS floating car system is an important approach for traffic data acquisition. However, in this system, the GPS blind areas caused by tall buildings or tunnels could affect the acquisition of tragic information and depress the system performance. Aiming at this problem, a novel method employing a back propagation (BP) neural network is developed to estimate the traffic speed in the GPS blind areas. When the speed of one road section is lost, the speed of its related road sections can be used to estimate its speed. The complete historical data of these road sections are used to train the neural network, using Levenberg-Marquardt learning algorithm. Then, the current speed of the related roads is used by the trained neural network to get the speed of the road section without GPS signal. We compare the speed of the road section estimated by our method with the real speed of this road section, and the experimental results show that the proposed method of traffic speed estimation is very effective.展开更多
The concepts of information fusion and the basic principles of neural networks are introduced. Neural net-works were introduced as a way of building an information fusion model in a coal mine monitoring system. This a...The concepts of information fusion and the basic principles of neural networks are introduced. Neural net-works were introduced as a way of building an information fusion model in a coal mine monitoring system. This assures the accurate transmission of the multi-sensor information that comes from the coal mine monitoring systems. The in-formation fusion mode was analyzed. An algorithm was designed based on this analysis and some simulation results were given. Finally,conclusions that could provide auxiliary decision making information to the coal mine dispatching officers were presented.展开更多
To improve the quality of the infrared image and enhance the information of the object,a dual band infrared image fusion method based on feature extraction and a novel multiple pulse coupled neural network(multi-PCNN)...To improve the quality of the infrared image and enhance the information of the object,a dual band infrared image fusion method based on feature extraction and a novel multiple pulse coupled neural network(multi-PCNN)is proposed.In this multi-PCNN fusion scheme,the auxiliary PCNN which captures the characteristics of feature image extracting from the infrared image is used to modulate the main PCNN,whose input could be original infrared image.Meanwhile,to make the PCNN fusion effect consistent with the human vision system,Laplacian energy is adopted to obtain the value of adaptive linking strength in PCNN.After that,the original dual band infrared images are reconstructed by using a weight fusion rule with the fire mapping images generated by the main PCNNs to obtain the fused image.Compared to wavelet transforms,Laplacian pyramids and traditional multi-PCNNs,fusion images based on our method have more information,rich details and clear edges.展开更多
Multiple complex networks, each with different properties and mutually fused, have the problems that the evolving process is time varying and non-equilibrium, network structures are layered and interlacing, and evolvi...Multiple complex networks, each with different properties and mutually fused, have the problems that the evolving process is time varying and non-equilibrium, network structures are layered and interlacing, and evolving characteristics are difficult to be measured. On that account, a dynamic evolving model of complex network with fusion nodes and overlap edges(CNFNOEs) is proposed. Firstly, we define some related concepts of CNFNOEs, and analyze the conversion process of fusion relationship and hierarchy relationship. According to the property difference of various nodes and edges, fusion nodes and overlap edges are subsequently split, and then the CNFNOEs is transformed to interlacing layered complex networks(ILCN). Secondly,the node degree saturation and attraction factors are defined. On that basis, the evolution algorithm and the local world evolution model for ILCN are put forward. Moreover, four typical situations of nodes evolution are discussed, and the degree distribution law during evolution is analyzed by means of the mean field method.Numerical simulation results show that nodes unreached degree saturation follow the exponential distribution with an error of no more than 6%; nodes reached degree saturation follow the distribution of their connection capacities with an error of no more than 3%; network weaving coefficients have a positive correlation with the highest probability of new node and initial number of connected edges. The results have verified the feasibility and effectiveness of the model, which provides a new idea and method for exploring CNFNOE's evolving process and law. Also, the model has good application prospects in structure and dynamics research of transportation network, communication network, social contact network,etc.展开更多
Traditional techniques based on image fusion are arduous in integrating complementary or heterogeneous infrared(IR)/visible(VS)images.Dissimilarities in various kind of features in these images are vital to preserve i...Traditional techniques based on image fusion are arduous in integrating complementary or heterogeneous infrared(IR)/visible(VS)images.Dissimilarities in various kind of features in these images are vital to preserve in the single fused image.Hence,simultaneous preservation of both the aspects at the same time is a challenging task.However,most of the existing methods utilize the manual extraction of features;and manual complicated designing of fusion rules resulted in a blurry artifact in the fused image.Therefore,this study has proposed a hybrid algorithm for the integration of multi-features among two heterogeneous images.Firstly,fuzzification of two IR/VS images has been done by feeding it to the fuzzy sets to remove the uncertainty present in the background and object of interest of the image.Secondly,images have been learned by two parallel branches of the siamese convolutional neural network(CNN)to extract prominent features from the images as well as high-frequency information to produce focus maps containing source image information.Finally,the obtained focused maps which contained the detailed integrated information are directly mapped with the source image via pixelwise strategy to result in fused image.Different parameters have been used to evaluate the performance of the proposed image fusion by achieving 1.008 for mutual information(MI),0.841 for entropy(EG),0.655 for edge information(EI),0.652 for human perception(HP),and 0.980 for image structural similarity(ISS).Experimental results have shown that the proposed technique has attained the best qualitative and quantitative results using 78 publically available images in comparison to the existing discrete cosine transform(DCT),anisotropic diffusion&karhunen-loeve(ADKL),guided filter(GF),random walk(RW),principal component analysis(PCA),and convolutional neural network(CNN)methods.展开更多
In the smart logistics industry,unmanned forklifts that intelligently identify logistics pallets can improve work efficiency in warehousing and transportation and are better than traditional manual forklifts driven by...In the smart logistics industry,unmanned forklifts that intelligently identify logistics pallets can improve work efficiency in warehousing and transportation and are better than traditional manual forklifts driven by humans.Therefore,they play a critical role in smart warehousing,and semantics segmentation is an effective method to realize the intelligent identification of logistics pallets.However,most current recognition algorithms are ineffective due to the diverse types of pallets,their complex shapes,frequent blockades in production environments,and changing lighting conditions.This paper proposes a novel multi-feature fusion-guided multiscale bidirectional attention(MFMBA)neural network for logistics pallet segmentation.To better predict the foreground category(the pallet)and the background category(the cargo)of a pallet image,our approach extracts three types of features(grayscale,texture,and Hue,Saturation,Value features)and fuses them.The multiscale architecture deals with the problem that the size and shape of the pallet may appear different in the image in the actual,complex environment,which usually makes feature extraction difficult.Our study proposes a multiscale architecture that can extract additional semantic features.Also,since a traditional attention mechanism only assigns attention rights from a single direction,we designed a bidirectional attention mechanism that assigns cross-attention weights to each feature from two directions,horizontally and vertically,significantly improving segmentation.Finally,comparative experimental results show that the precision of the proposed algorithm is 0.53%–8.77%better than that of other methods we compared.展开更多
Faced with the massive amount of online shopping clothing images,how to classify them quickly and accurately is a challenging task in image classification.In this paper,we propose a novel method,named Multi_XMNet,to s...Faced with the massive amount of online shopping clothing images,how to classify them quickly and accurately is a challenging task in image classification.In this paper,we propose a novel method,named Multi_XMNet,to solve the clothing images classification problem.The proposed method mainly consists of two convolution neural network(CNN)branches.One branch extracts multiscale features from the whole expressional image by Multi_X which is designed by improving the Xception network,while the other extracts attention mechanism features from the whole expressional image by MobileNetV3-small network.Both multiscale and attention mechanism features are aggregated before making classification.Additionally,in the training stage,global average pooling(GAP),convolutional layers,and softmax classifiers are used instead of the fully connected layer to classify the final features,which speed up model training and alleviate the problem of overfitting caused by too many parameters.Experimental comparisons are made in the public DeepFashion dataset.The experimental results show that the classification accuracy of this method is 95.38%,which is better than InceptionV3,Xception and InceptionV3_Xception by 5.58%,3.32%,and 2.22%,respectively.The proposed Multi_XMNet image classification model can help enterprises and researchers in the field of clothing e-commerce to automaticly,efficiently and accurately classify massive clothing images.展开更多
基金financially supported by the Key Project of National Natural Science Foundation of China (No. 41930431)the Project of National Natural Science Foundation of China (Nos. 41904121, 41804133, and 41974116)Joint Guidance Project of Natural Science Foundation of Heilongjiang Province (No. LH2020D006)
文摘Based on the CNN-LSTM fusion deep neural network,this paper proposes a seismic velocity model building method that can simultaneously estimate the root mean square(RMS)velocity and interval velocity from the common-midpoint(CMP)gather.In the proposed method,a convolutional neural network(CNN)Encoder and two long short-term memory networks(LSTMs)are used to extract spatial and temporal features from seismic signals,respectively,and a CNN Decoder is used to recover RMS velocity and interval velocity of underground media from various feature vectors.To address the problems of unstable gradients and easily fall into a local minimum in the deep neural network training process,we propose to use Kaiming normal initialization with zero negative slopes of rectifi ed units and to adjust the network learning process by optimizing the mean square error(MSE)loss function with the introduction of a freezing factor.The experiments on testing dataset show that CNN-LSTM fusion deep neural network can predict RMS velocity as well as interval velocity more accurately,and its inversion accuracy is superior to that of single neural network models.The predictions on the complex structures and Marmousi model are consistent with the true velocity variation trends,and the predictions on fi eld data can eff ectively correct the phase axis,improve the lateral continuity of phase axis and quality of stack section,indicating the eff ectiveness and decent generalization capability of the proposed method.
基金Guangdong Science and Technology Program under Grant No.202206010052Foshan Province R&D Key Project under Grant No.2020001006827Guangdong Academy of Sciences Integrated Industry Technology Innovation Center Action Special Project under Grant No.2022GDASZH-2022010108.
文摘The employment of deep convolutional neural networks has recently contributed to significant progress in single image super-resolution(SISR)research.However,the high computational demands of most SR techniques hinder their applicability to edge devices,despite their satisfactory reconstruction performance.These methods commonly use standard convolutions,which increase the convolutional operation cost of the model.In this paper,a lightweight Partial Separation and Multiscale Fusion Network(PSMFNet)is proposed to alleviate this problem.Specifically,this paper introduces partial convolution(PConv),which reduces the redundant convolution operations throughout the model by separating some of the features of an image while retaining features useful for image reconstruction.Additionally,it is worth noting that the existing methods have not fully utilized the rich feature information,leading to information loss,which reduces the ability to learn feature representations.Inspired by self-attention,this paper develops a multiscale feature fusion block(MFFB),which can better utilize the non-local features of an image.MFFB can learn long-range dependencies from the spatial dimension and extract features from the channel dimension,thereby obtaining more comprehensive and rich feature information.As the role of the MFFB is to capture rich global features,this paper further introduces an efficient inverted residual block(EIRB)to supplement the local feature extraction ability of PSMFNet.A comprehensive analysis of the experimental results shows that PSMFNet maintains a better performance with fewer parameters than the state-of-the-art models.
基金supported by the National Natural Science Foundation of China(No.62302540)with author Fangfang Shan.For more information,please visit their website at https://www.nsfc.gov.cn/(accessed on 31/05/2024)+3 种基金Additionally,it is also funded by the Open Foundation of Henan Key Laboratory of Cyberspace Situation Awareness(No.HNTS2022020)where Fangfang Shan is an author.Further details can be found at http://xt.hnkjt.gov.cn/data/pingtai/(accessed on 31/05/2024)supported by the Natural Science Foundation of Henan Province Youth Science Fund Project(No.232300420422)for more information,you can visit https://kjt.henan.gov.cn/2022/09-02/2599082.html(accessed on 31/05/2024).
文摘Social media has become increasingly significant in modern society,but it has also turned into a breeding ground for the propagation of misleading information,potentially causing a detrimental impact on public opinion and daily life.Compared to pure text content,multmodal content significantly increases the visibility and share ability of posts.This has made the search for efficient modality representations and cross-modal information interaction methods a key focus in the field of multimodal fake news detection.To effectively address the critical challenge of accurately detecting fake news on social media,this paper proposes a fake news detection model based on crossmodal message aggregation and a gated fusion network(MAGF).MAGF first uses BERT to extract cumulative textual feature representations and word-level features,applies Faster Region-based ConvolutionalNeuralNetwork(Faster R-CNN)to obtain image objects,and leverages ResNet-50 and Visual Geometry Group-19(VGG-19)to obtain image region features and global features.The image region features and word-level text features are then projected into a low-dimensional space to calculate a text-image affinity matrix for cross-modal message aggregation.The gated fusion network combines text and image region features to obtain adaptively aggregated features.The interaction matrix is derived through an attention mechanism and further integrated with global image features using a co-attention mechanism to producemultimodal representations.Finally,these fused features are fed into a classifier for news categorization.Experiments were conducted on two public datasets,Twitter and Weibo.Results show that the proposed model achieves accuracy rates of 91.8%and 88.7%on the two datasets,respectively,significantly outperforming traditional unimodal and existing multimodal models.
基金supported by the National Key R&D Program of China(2018AAA0102100)the National Natural Science Foundation of China(No.62376287)+3 种基金the International Science and Technology Innovation Joint Base of Machine Vision and Medical Image Processing in Hunan Province(2021CB1013)the Key Research and Development Program of Hunan Province(2022SK2054)the Natural Science Foundation of Hunan Province(No.2022JJ30762,2023JJ70016)the 111 Project under Grant(No.B18059).
文摘Automatic segmentation of medical images provides a reliable scientific basis for disease diagnosis and analysis.Notably,most existing methods that combine the strengths of convolutional neural networks(CNNs)and Transformers have made significant progress.However,there are some limitations in the current integration of CNN and Transformer technology in two key aspects.Firstly,most methods either overlook or fail to fully incorporate the complementary nature between local and global features.Secondly,the significance of integrating the multiscale encoder features from the dual-branch network to enhance the decoding features is often disregarded in methods that combine CNN and Transformer.To address this issue,we present a groundbreaking dual-branch cross-attention fusion network(DCFNet),which efficiently combines the power of Swin Transformer and CNN to generate complementary global and local features.We then designed the Feature Cross-Fusion(FCF)module to efficiently fuse local and global features.In the FCF,the utilization of the Channel-wise Cross-fusion Transformer(CCT)serves the purpose of aggregatingmulti-scale features,and the Feature FusionModule(FFM)is employed to effectively aggregate dual-branch prominent feature regions from the spatial perspective.Furthermore,within the decoding phase of the dual-branch network,our proposed Channel Attention Block(CAB)aims to emphasize the significance of the channel features between the up-sampled features and the features generated by the FCFmodule to enhance the details of the decoding.Experimental results demonstrate that DCFNet exhibits enhanced accuracy in segmentation performance.Compared to other state-of-the-art(SOTA)methods,our segmentation framework exhibits a superior level of competitiveness.DCFNet’s accurate segmentation of medical images can greatly assist medical professionals in making crucial diagnoses of lesion areas in advance.
文摘Convolutional neural networks (CNNs) are widely used in image classification tasks, but their increasing model size and computation make them challenging to implement on embedded systems with constrained hardware resources. To address this issue, the MobileNetV1 network was developed, which employs depthwise convolution to reduce network complexity. MobileNetV1 employs a stride of 2 in several convolutional layers to decrease the spatial resolution of feature maps, thereby lowering computational costs. However, this stride setting can lead to a loss of spatial information, particularly affecting the detection and representation of smaller objects or finer details in images. To maintain the trade-off between complexity and model performance, a lightweight convolutional neural network with hierarchical multi-scale feature fusion based on the MobileNetV1 network is proposed. The network consists of two main subnetworks. The first subnetwork uses a depthwise dilated separable convolution (DDSC) layer to learn imaging features with fewer parameters, which results in a lightweight and computationally inexpensive network. Furthermore, depthwise dilated convolution in DDSC layer effectively expands the field of view of filters, allowing them to incorporate a larger context. The second subnetwork is a hierarchical multi-scale feature fusion (HMFF) module that uses parallel multi-resolution branches architecture to process the input feature map in order to extract the multi-scale feature information of the input image. Experimental results on the CIFAR-10, Malaria, and KvasirV1 datasets demonstrate that the proposed method is efficient, reducing the network parameters and computational cost by 65.02% and 39.78%, respectively, while maintaining the network performance compared to the MobileNetV1 baseline.
文摘The purpose of this study was to address the challenges in predicting and classifying accuracy in modeling Container Dwell Time (CDT) using Artificial Neural Networks (ANN). This objective was driven by the suboptimal outcomes reported in previous studies and sought to apply an innovative approach to improve these results. To achieve this, the study applied the Fusion of Activation Functions (FAFs) to a substantial dataset. This dataset included 307,594 container records from the Port of Tema from 2014 to 2022, encompassing both import and transit containers. The RandomizedSearchCV algorithm from Python’s Scikit-learn library was utilized in the methodological approach to yield the optimal activation function for prediction accuracy. The results indicated that “ajaLT”, a fusion of the Logistic and Hyperbolic Tangent Activation Functions, provided the best prediction accuracy, reaching a high of 82%. Despite these encouraging findings, it’s crucial to recognize the study’s limitations. While Fusion of Activation Functions is a promising method, further evaluation is necessary across different container types and port operations to ascertain the broader applicability and generalizability of these findings. The original value of this study lies in its innovative application of FAFs to CDT. Unlike previous studies, this research evaluates the method based on prediction accuracy rather than training time. It opens new avenues for machine learning engineers and researchers in applying FAFs to enhance prediction accuracy in CDT modeling, contributing to a previously underexplored area.
基金The National Natural Science Foundation of China(No.60875035)the Natural Science Foundation of Jiangsu Province(No.BK2008294)the National High Technology Research and Development Program of China(863 Program)(No.2006AA05A107)
文摘A constructive-pruning hybrid method (CPHM) for radial basis function (RBF) networks is proposed to improve the prediction accuracy of ash fusion temperatures (AFT). The CPHM incorporates the advantages of the construction algorithm and the pruning algorithm of neural networks, and the training process of the CPHM is divided into two stages: rough tuning and fine tuning. In rough tuning, new hidden units are added to the current network until some performance index is satisfied. In fine tuning, the network structure and the model parameters are further adjusted. And, based on components of coal ash, a model using the CPHM is established to predict the AFT. The results show that the CPHM prediction model is characterized by its high precision, compact network structure, as well as strong generalization ability and robustness.
基金supported by funding from the Natural Science Foundation of Jiangsu Province (Grant No. BK20171457)the 2013 Special Fund for Meteorological Scientific Research in the Public Interest (Grant No. GYHY201306078)+1 种基金the National Natural Science Foundation of China (Grant No. 41301399)Funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD)
文摘The spaceborne precipitation radar onboard the Tropical Rainfall Measuring Mission satellite (TRMM PR) can provide good measurement of the vertical structure of reflectivity, while ground radar (GR) has a relatively high horizontal resolution and greater sensitivity. Fusion of TRMM PR and GR reflectivity data may maximize the advantages from both instruments. In this paper, TRMM PR and GR reflectivity data are fused using a neural network (NN)-based approach. The main steps included are: quality control of TRMM PR and GR reflectivity data; spatiotemporal matchup; GR calibration bias correction; conversion of TRMM PR data from Ku to S band; fusion of TRMM PR and GR reflectivity data with an NN method: interpolation of reflectivity data that are below PR's sensitivity; blind areas compensation with a distance weighting-based merging approach; combination of three types of data: data with the NN method, data below PR's sensitivity and data within compensated blind areas. During the NN fusion step, the TRMM PR data are taken as targets of the training NNs, and gridded GR data after horizontal downsampling at different heights are used as the input. The trained NNs are then used to obtain 3D high-resolution reflectivity from the original GR gridded data. After 3D fusion of the TRMM PR and GR reflectivity data, a more complete and finer-scale 3D radar reflectivity dataset incorporating characteristics from both the TRMM PR and GR observations can be obtained. The fused reflectivity data are evaluated based on a convective precipitation event through comparison with the high resolution TRMM PR and GR data with an interpolation algorithm.
基金Supported in part by Science & Technology Department of Shanghai (05dz15004)
文摘In wireless sensor networks, target classification differs from that in centralized sensing systems because of the distributed detection, wireless communication and limited resources. We study the classification problem of moving vehicles in wireless sensor networks using acoustic signals emitted from vehicles. Three algorithms including wavelet decomposition, weighted k-nearest-neighbor and Dempster-Shafer theory are combined in this paper. Finally, we use real world experimental data to validate the classification methods. The result shows that wavelet based feature extraction method can extract stable features from acoustic signals. By fusion with Dempster's rule, the classification performance is improved.
基金This project was supported by the National Natural Science Foundation of China (60572038)
文摘In practical multi-sensor information fusion systems, there exists uncertainty about the network structure, active state of sensors, and information itself (including fuzziness, randomness, incompleteness as well as roughness, etc). Hence it requires investigating the problem of uncertain information fusion. Robust learning algorithm which adapts to complex environment and the fuzzy inference algorithm which disposes fuzzy information are explored to solve the problem. Based on the fusion technology of neural networks and fuzzy inference algorithm, a multi-sensor uncertain information fusion system is modeled. Also RANFIS learning algorithm and fusing weight synthesized inference algorithm are developed from the ANFIS algorithm according to the concept of robust neural networks. This fusion system mainly consists of RANFIS confidence estimator, fusing weight synthesized inference knowledge base and weighted fusion section. The simulation result demonstrates that the proposed fusion model and algorithm have the capability of uncertain information fusion, thus is obviously advantageous compared with the conventional Kalman weighted fusion algorithm.
文摘Scene recognition is a popular open problem in the computer vision field.Among lots of methods proposed in recent years,Convolutional Neural Network(CNN)based approaches achieve the best performance in scene recognition.We propose in this paper an advanced feature fusion algorithm using Multiple Convolutional Neural Network(Multi-CNN)for scene recognition.Unlike existing works that usually use individual convolutional neural network,a fusion of multiple different convolutional neural networks is applied for scene recognition.Firstly,we split training images in two directions and apply to three deep CNN model,and then extract features from the last full-connected(FC)layer and probabilistic layer on each model.Finally,feature vectors are fused with different fusion strategies in groups forwarded into SoftMax classifier.Our proposed algorithm is evaluated on three scene datasets for scene recognition.The experimental results demonstrate the effectiveness of proposed algorithm compared with other state-of-art approaches.
基金The National Natural Science Foundation of China(No.61603091)。
文摘In order to improve the detection accuracy of small objects,a neighborhood fusion-based hierarchical parallel feature pyramid network(NFPN)is proposed.Unlike the layer-by-layer structure adopted in the feature pyramid network(FPN)and deconvolutional single shot detector(DSSD),where the bottom layer of the feature pyramid network relies on the top layer,NFPN builds the feature pyramid network with no connections between the upper and lower layers.That is,it only fuses shallow features on similar scales.NFPN is highly portable and can be embedded in many models to further boost performance.Extensive experiments on PASCAL VOC 2007,2012,and COCO datasets demonstrate that the NFPN-based SSD without intricate tricks can exceed the DSSD model in terms of detection accuracy and inference speed,especially for small objects,e.g.,4%to 5%higher mAP(mean average precision)than SSD,and 2%to 3%higher mAP than DSSD.On VOC 2007 test set,the NFPN-based SSD with 300×300 input reaches 79.4%mAP at 34.6 frame/s,and the mAP can raise to 82.9%after using the multi-scale testing strategy.
基金funded by National Key Technology R&D Program of China (No.2006BAG01A03)
文摘Obtaining comprehensive and accurate information is very important in intelligent tragic system (ITS). In ITS, the GPS floating car system is an important approach for traffic data acquisition. However, in this system, the GPS blind areas caused by tall buildings or tunnels could affect the acquisition of tragic information and depress the system performance. Aiming at this problem, a novel method employing a back propagation (BP) neural network is developed to estimate the traffic speed in the GPS blind areas. When the speed of one road section is lost, the speed of its related road sections can be used to estimate its speed. The complete historical data of these road sections are used to train the neural network, using Levenberg-Marquardt learning algorithm. Then, the current speed of the related roads is used by the trained neural network to get the speed of the road section without GPS signal. We compare the speed of the road section estimated by our method with the real speed of this road section, and the experimental results show that the proposed method of traffic speed estimation is very effective.
基金project BK2001073 supported by Jiangsu Province Natural Science Foundation
文摘The concepts of information fusion and the basic principles of neural networks are introduced. Neural net-works were introduced as a way of building an information fusion model in a coal mine monitoring system. This assures the accurate transmission of the multi-sensor information that comes from the coal mine monitoring systems. The in-formation fusion mode was analyzed. An algorithm was designed based on this analysis and some simulation results were given. Finally,conclusions that could provide auxiliary decision making information to the coal mine dispatching officers were presented.
基金Supported by the National Natural Science Foundation of China(60905012,60572058)
文摘To improve the quality of the infrared image and enhance the information of the object,a dual band infrared image fusion method based on feature extraction and a novel multiple pulse coupled neural network(multi-PCNN)is proposed.In this multi-PCNN fusion scheme,the auxiliary PCNN which captures the characteristics of feature image extracting from the infrared image is used to modulate the main PCNN,whose input could be original infrared image.Meanwhile,to make the PCNN fusion effect consistent with the human vision system,Laplacian energy is adopted to obtain the value of adaptive linking strength in PCNN.After that,the original dual band infrared images are reconstructed by using a weight fusion rule with the fire mapping images generated by the main PCNNs to obtain the fused image.Compared to wavelet transforms,Laplacian pyramids and traditional multi-PCNNs,fusion images based on our method have more information,rich details and clear edges.
基金supported by the National Natural Science Foundation of China(615730176140149961174162)
文摘Multiple complex networks, each with different properties and mutually fused, have the problems that the evolving process is time varying and non-equilibrium, network structures are layered and interlacing, and evolving characteristics are difficult to be measured. On that account, a dynamic evolving model of complex network with fusion nodes and overlap edges(CNFNOEs) is proposed. Firstly, we define some related concepts of CNFNOEs, and analyze the conversion process of fusion relationship and hierarchy relationship. According to the property difference of various nodes and edges, fusion nodes and overlap edges are subsequently split, and then the CNFNOEs is transformed to interlacing layered complex networks(ILCN). Secondly,the node degree saturation and attraction factors are defined. On that basis, the evolution algorithm and the local world evolution model for ILCN are put forward. Moreover, four typical situations of nodes evolution are discussed, and the degree distribution law during evolution is analyzed by means of the mean field method.Numerical simulation results show that nodes unreached degree saturation follow the exponential distribution with an error of no more than 6%; nodes reached degree saturation follow the distribution of their connection capacities with an error of no more than 3%; network weaving coefficients have a positive correlation with the highest probability of new node and initial number of connected edges. The results have verified the feasibility and effectiveness of the model, which provides a new idea and method for exploring CNFNOE's evolving process and law. Also, the model has good application prospects in structure and dynamics research of transportation network, communication network, social contact network,etc.
文摘Traditional techniques based on image fusion are arduous in integrating complementary or heterogeneous infrared(IR)/visible(VS)images.Dissimilarities in various kind of features in these images are vital to preserve in the single fused image.Hence,simultaneous preservation of both the aspects at the same time is a challenging task.However,most of the existing methods utilize the manual extraction of features;and manual complicated designing of fusion rules resulted in a blurry artifact in the fused image.Therefore,this study has proposed a hybrid algorithm for the integration of multi-features among two heterogeneous images.Firstly,fuzzification of two IR/VS images has been done by feeding it to the fuzzy sets to remove the uncertainty present in the background and object of interest of the image.Secondly,images have been learned by two parallel branches of the siamese convolutional neural network(CNN)to extract prominent features from the images as well as high-frequency information to produce focus maps containing source image information.Finally,the obtained focused maps which contained the detailed integrated information are directly mapped with the source image via pixelwise strategy to result in fused image.Different parameters have been used to evaluate the performance of the proposed image fusion by achieving 1.008 for mutual information(MI),0.841 for entropy(EG),0.655 for edge information(EI),0.652 for human perception(HP),and 0.980 for image structural similarity(ISS).Experimental results have shown that the proposed technique has attained the best qualitative and quantitative results using 78 publically available images in comparison to the existing discrete cosine transform(DCT),anisotropic diffusion&karhunen-loeve(ADKL),guided filter(GF),random walk(RW),principal component analysis(PCA),and convolutional neural network(CNN)methods.
基金supported by the Postgraduate Scientific Research Innovation Project of Hunan Province under Grant QL20210212the Scientific Innovation Fund for Postgraduates of Central South University of Forestry and Technology under Grant CX202102043.
文摘In the smart logistics industry,unmanned forklifts that intelligently identify logistics pallets can improve work efficiency in warehousing and transportation and are better than traditional manual forklifts driven by humans.Therefore,they play a critical role in smart warehousing,and semantics segmentation is an effective method to realize the intelligent identification of logistics pallets.However,most current recognition algorithms are ineffective due to the diverse types of pallets,their complex shapes,frequent blockades in production environments,and changing lighting conditions.This paper proposes a novel multi-feature fusion-guided multiscale bidirectional attention(MFMBA)neural network for logistics pallet segmentation.To better predict the foreground category(the pallet)and the background category(the cargo)of a pallet image,our approach extracts three types of features(grayscale,texture,and Hue,Saturation,Value features)and fuses them.The multiscale architecture deals with the problem that the size and shape of the pallet may appear different in the image in the actual,complex environment,which usually makes feature extraction difficult.Our study proposes a multiscale architecture that can extract additional semantic features.Also,since a traditional attention mechanism only assigns attention rights from a single direction,we designed a bidirectional attention mechanism that assigns cross-attention weights to each feature from two directions,horizontally and vertically,significantly improving segmentation.Finally,comparative experimental results show that the precision of the proposed algorithm is 0.53%–8.77%better than that of other methods we compared.
基金Fundamental Research Funds for the Central Universities of Ministry of Education of China(No.19D111201)。
文摘Faced with the massive amount of online shopping clothing images,how to classify them quickly and accurately is a challenging task in image classification.In this paper,we propose a novel method,named Multi_XMNet,to solve the clothing images classification problem.The proposed method mainly consists of two convolution neural network(CNN)branches.One branch extracts multiscale features from the whole expressional image by Multi_X which is designed by improving the Xception network,while the other extracts attention mechanism features from the whole expressional image by MobileNetV3-small network.Both multiscale and attention mechanism features are aggregated before making classification.Additionally,in the training stage,global average pooling(GAP),convolutional layers,and softmax classifiers are used instead of the fully connected layer to classify the final features,which speed up model training and alleviate the problem of overfitting caused by too many parameters.Experimental comparisons are made in the public DeepFashion dataset.The experimental results show that the classification accuracy of this method is 95.38%,which is better than InceptionV3,Xception and InceptionV3_Xception by 5.58%,3.32%,and 2.22%,respectively.The proposed Multi_XMNet image classification model can help enterprises and researchers in the field of clothing e-commerce to automaticly,efficiently and accurately classify massive clothing images.