Pose-invariant facial expression recognition (FER) is an active but challenging research topic in computer vision. Diverse observation angles in particular make the trained model parameters inconsistent from one view to another. This study develops a deep global multiple-scale and local patches attention (GMS-LPA) dual-branch network for pose-invariant FER to weaken the influence of pose variation and self-occlusion on recognition accuracy. The designed GMS-LPA network contains four main parts: the feature extraction module, the global multiple-scale (GMS) module, the local patches attention (LPA) module, and the model-level fusion module. The feature extraction module extracts texture information and normalizes it to the same size. The GMS module extracts deep global features with different receptive fields, reducing the sensitivity of deeper convolution layers to pose variation and self-occlusion. The LPA module forces the network to focus on local salient features, which lowers the effect of pose variation and self-occlusion on recognition results. The extracted features are then fused with a model-level strategy to improve recognition accuracy. Extensive experiments were conducted on four public databases, and the recognition results demonstrated the feasibility and validity of the proposed methods.
Early screening of diabetic retinopathy (DR) plays an important role in preventing irreversible blindness. Existing research has not fully exploited the DR lesion information available in fundus images. In addition, traditional attention schemes do not consider the impact of lesion type differences on grading, so important lesion features are extracted poorly. This paper therefore proposes a DR diagnosis scheme that integrates a multi-level patch attention generator (MPAG) and a lesion localization module (LLM). First, the MPAG predicts patches of different sizes and generates a weighted attention map based on the prediction score and the types of lesions contained in the patches. This accounts for the impact of lesion type differences on grading and addresses the problem that lesion attention maps could not be further refined and adapted to the final DR diagnosis task. Second, the LLM generates a global attention map based on localization. Finally, the weighted attention map and the global attention map are applied as weights to the fundus image to exploit DR lesion information and increase the attention of the classification network to lesion details. Extensive experiments on the public DDR dataset demonstrate the effectiveness of the proposed method, which obtains an accuracy of 0.8064.
Food image recognition, a subset of fine-grained image recognition, must cope with substantial intra-class variation and minimal inter-class differences. These challenges are compounded by the irregular and multi-scale nature of food images. To address them, our study introduces a model that combines multiple attention mechanisms and multi-stage local fusion, built on the ConvNeXt architecture. The model employs hybrid attention (HA) mechanisms to pinpoint critical discriminative regions within images, substantially mitigating the influence of background noise. It also introduces a multi-stage local fusion (MSLF) module, which builds long-distance dependencies between feature maps at different stages. This approach lets the model assimilate complementary features across scales and strengthens its capacity for feature extraction. In addition, we constructed a dataset named Roushi60, which consists of 60 categories of common meat dishes. Empirical evaluation on the ETH Food-101, ChineseFoodNet, and Roushi60 datasets shows that our model achieves recognition accuracies of 91.12%, 82.86%, and 92.50%, respectively. These figures improve on the baseline ConvNeXt network by 1.04%, 3.42%, and 1.36% and surpass the performance of most contemporary food image recognition methods, which supports the efficacy of the proposed model.
Today, with the rapid development of the internet, a large amount of information often accompanies the rapid transmission of disease outbreaks, and increasing numbers of scholars are studying the relationship between information and the disease transmission process using complex networks. In fact, the disease transmission process is very complex. Besides information, individual behavioral measures and other factors often need to be considered. Most previous research has established a two-layer network model to consider the impact of information on disease transmission and has rarely separated information from behavior. To carry out a more in-depth analysis of the disease transmission process and its intrinsic influencing mechanism, this paper divides information and behavior into two layers and proposes a complex network to study the dynamic co-evolution of information diffusion, vaccination behavior, and disease transmission. This is achieved by considering four influential relationships between adjacent layers in multilayer networks. In the information layer, the diffusion process of negative information is described, and the feedback effects of local and global vaccination are considered. In the behavioral layer, an individual's vaccination behavior is described, and the probability of an individual receiving a vaccination is influenced by two factors: the influence of negative information, and the influence of local and global disease severity. In the disease layer, individual susceptibility is considered to be influenced by vaccination behavior. The state transition equations are derived using the micro Markov chain approach (MMCA), and disease prevalence thresholds are obtained. Simulation experiments demonstrate that negative information diffusion is less influenced by local vaccination behavior and is mainly influenced by global vaccination behavior; vaccination behavior is mainly influenced by local disease conditions and is less influenced by global disease conditions; the disease transmission threshold increases with the vaccination rate; and the scale of disease transmission increases with the negative information diffusion rate and decreases with the vaccination rate. Finally, it is found that when individual vaccination behavior takes into account both negative information and disease, the disease transmission threshold increases and the scale of disease transmission decreases. Therefore, we should resist the diffusion of negative information, increase vaccination proportions, and take appropriate protective measures in time.
The goal of street-to-aerial cross-view image geo-localization is to determine the location of a query street-view image by retrieving the aerial-view image of the same place. The drastic viewpoint and appearance gap between aerial-view and street-view images poses a major challenge for this task. In this paper, we propose a novel multiscale attention encoder to capture the multiscale contextual information of the aerial-view and street-view images. To bridge the domain gap between the two views, we first use an inverse polar transform to make the street-view images approximately aligned with the aerial-view images. Then, the multiscale attention encoder converts each image into a feature representation under the guidance of the learnt multiscale information. Finally, we propose a novel global mining strategy that enables the network to pay more attention to hard negative exemplars. Experiments on standard benchmark datasets show that our approach obtains a top-1 recall rate of 81.39% on the CVUSA dataset and 71.52% on the CVACT dataset, achieving state-of-the-art performance and significantly outperforming most existing methods.
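As a reading aid for the top-1 recall figures quoted above, the following is a minimal sketch of how a top-k recall rate is typically computed for a retrieval task. It is not the authors' evaluation code. The function name `top_k_recall`, the toy similarity matrix, and the assumption that the true aerial match for street query i sits at gallery index i are all illustrative assumptions.

```python
def top_k_recall(similarity, k=1):
    """Fraction of queries whose true match is among the k best-scoring gallery items.

    similarity[i][j] is the score between street-view query i and aerial image j.
    Assumption for this sketch: the ground-truth match of query i is gallery index i.
    """
    hits = 0
    for i, row in enumerate(similarity):
        # Gallery indices sorted from most to least similar.
        ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        if i in ranked[:k]:
            hits += 1
    return hits / len(similarity)


# Toy scores (hypothetical): query 1 ranks its true match second.
sim = [
    [0.9, 0.1, 0.3],
    [0.2, 0.4, 0.8],
    [0.1, 0.2, 0.7],
]
top1 = top_k_recall(sim, k=1)
top2 = top_k_recall(sim, k=2)
```

In this toy case two of three queries retrieve the correct image first, and all three do so within the top two.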
Second-generation high-temperature superconducting (HTS) conductors, specifically rare earth-barium-copper-oxide (REBCO) coated conductor (CC) tapes, are promising candidates for high-energy and high-field superconducting applications. In epoxy-impregnated REBCO composite magnets, which comprise multilayer components, the thermomechanical characteristics of each component differ considerably under extremely low temperatures and strong electromagnetic fields. Traditional numerical models include homogenized orthotropic models, which simplify the overall field calculation but miss detailed multi-physics aspects, and full refinement (FR) models, which are thorough but computationally demanding. Herein, we propose an extended multi-scale approach for analyzing the multi-field characteristics of an epoxy-impregnated composite magnet assembled from HTS pancake coils. This approach combines a global homogenization (GH) scheme based on the homogenized electromagnetic T-A model (a method for solving Maxwell's equations for superconducting materials based on the current vector potential T and the magnetic vector potential A) with a homogenized orthotropic thermoelastic model to assess the electromagnetic and thermoelastic properties at the macroscopic scale. We then identify "dangerous regions" at the macroscopic scale and obtain finer details using a local refinement (LR) scheme that captures the responses of each component material in the HTS composite tapes at the mesoscopic scale. The results of the GH-LR multi-scale approach agree well with those of the FR scheme and with experimental data in the literature, indicating that the approach is accurate and efficient. The proposed GH-LR multi-scale approach can serve as a valuable tool for evaluating the risk of failure in large-scale HTS composite magnets.
Indoor localization methods can help many sectors, such as healthcare centers, smart homes, museums, warehouses, and retail malls, improve their service areas. It is therefore crucial to look for low-cost methods that can provide exact localization in indoor locations. In this context, image-based localization methods can play an important role in estimating both the position and the orientation of a camera relative to an object. Image-based localization faces many issues, such as image scale and rotation variance. Accuracy and speed (latency) are also two critical factors. This paper proposes an efficient 6-DoF deep-learning model for image-based localization. The model incorporates a channel attention module and a Scale Pyramid Module (SPM). It not only enhances accuracy but also preserves the model's real-time performance. In complex scenes, the channel attention module is employed to distinguish between the textures of foregrounds and backgrounds. The model adopts the SPM, a feature pyramid module, to deal with image scale and rotation variance. Furthermore, the proposed model employs two regressions (two fully connected layers), one for position and the other for orientation, which increases accuracy. Experiments on standard indoor and outdoor datasets show that the proposed model has a significantly lower Mean Squared Error (MSE) for both position and orientation. On the indoor 7-Scenes dataset, the MSE is reduced to 0.19 m for position and 6.25° for orientation. On the outdoor Cambridge Landmarks dataset, the MSE is reduced to 0.63 m for position and 2.03° for orientation. According to these findings, the proposed approach outperforms the baseline methods.
Constrained multi-objective optimization problems (CMOPs) involve both the optimization of objective functions and the satisfaction of constraint conditions, which challenges solvers. To solve CMOPs, constrained multi-objective evolutionary algorithms (CMOEAs) have been developed. However, most of them tend to converge to local areas due to the loss of diversity. Evolutionary multitasking (EMT) is a new paradigm for solving complex optimization problems through knowledge transfer between the source task and other related tasks. Inspired by EMT, this paper develops a new EMT-based CMOEA to solve CMOPs, in which a main task, a global auxiliary task, and a local auxiliary task are created, each optimized by its own population. The main task focuses on finding the feasible Pareto front (PF), and the global and local auxiliary tasks are used to enhance global and local diversity, respectively. The global auxiliary task performs a global search by ignoring constraints, which helps the population of the main task pass through infeasible obstacles. The local auxiliary task provides local diversity around the population of the main task in order to exploit promising regions. Through knowledge transfer among the three tasks, the search ability of the population of the main task is significantly improved. Experimental results on three benchmark test suites demonstrate that the proposed CMOEA performs better than, or competitively with, other state-of-the-art CMOEAs.
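The distinction between searching for the feasible Pareto front and searching while ignoring constraints can be illustrated with a small sketch. This is not the proposed EMT-based algorithm and contains no knowledge transfer; it only contrasts the two notions of a front on a fixed set of candidate solutions. The function names, the minimization convention, and the toy data are assumptions made for illustration.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def nondominated(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def feasible_front(solutions):
    """Front over feasible solutions only (constraint violation <= 0)."""
    feasible = [obj for obj, violation in solutions if violation <= 0.0]
    return nondominated(feasible)


def unconstrained_front(solutions):
    """Front obtained when constraints are ignored entirely."""
    return nondominated([obj for obj, _ in solutions])


# Each entry: (objective vector, total constraint violation). Hypothetical data.
solutions = [
    ((1.0, 4.0), 0.0),
    ((2.0, 2.0), 0.0),
    ((3.0, 3.0), 0.0),  # feasible but dominated by (2.0, 2.0)
    ((0.5, 0.5), 1.2),  # infeasible, yet dominates every other point
]
main_front = feasible_front(solutions)
aux_front = unconstrained_front(solutions)
```

Here the constraint-ignoring front consists only of the infeasible point, which shows why a population that ignores constraints can explore regions that a feasibility-driven population would not reach on its own.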
Vehicle re-identification (ReID) aims to retrieve a target vehicle from an extensive image gallery using its appearance from various views in a cross-camera scenario. It has gradually become a core technology of intelligent transportation systems. Most existing vehicle re-identification models adopt joint learning of global and local features. However, they use the extracted global features directly, resulting in insufficient feature expression. Moreover, local features are primarily obtained through advance annotation and complex attention mechanisms, which incur additional costs. To solve these issues, this paper proposes a multi-feature learning model with enhanced local attention for vehicle re-identification (MFELA). The model consists of a global branch and a local branch. The global branch uses both middle-level and high-level semantic features of ResNet50 to enhance the global representation capability. In addition, multi-scale pooling operations are used to obtain multi-scale information. The local branch uses the proposed Region Batch Dropblock (RBD), which encourages the model to learn discriminative features for different local regions and randomly drops the same corresponding areas across a batch during training to enhance attention to local regions. Features from both branches are then combined to provide a more comprehensive and distinctive feature representation. Extensive experiments on the VeRi-776 and VehicleID datasets show that our method performs well.
In this paper, we develop a novel global-attention-based neural network (GANN) for vision language intelligence, specifically image captioning (language description of a given image). As in many previous works, our proposed model adopts the encoder-decoder framework. The encoder encodes the region proposal features and extracts a global caption feature using a specially designed module that predicts the caption objects. The decoder generates captions by taking the global caption feature, along with the encoded visual features, as inputs to each attention head of the decoder layer. The global caption feature is introduced to explore the latent contributions of region proposals to image captioning and to help the decoder focus on the most relevant proposals, so that it extracts a more accurate visual feature at each time step of caption generation. GANN is implemented by incorporating the global caption feature into the attention weight calculation phase of the word prediction process in each head of the decoder layer. In our experiments, we qualitatively analyzed the proposed model and quantitatively evaluated several state-of-the-art schemes with GANN on the MS-COCO dataset. Experimental results demonstrate the effectiveness of the proposed global attention mechanism for image captioning.
Identifying influential nodes in complex networks and ranking their importance plays an important role in many fields, such as public opinion analysis, marketing, and epidemic prevention and control. Existing node centrality measures consider only a specific statistical feature of a single dimension. To address this, an SLGC model is proposed that combines a node's self-influence, its local neighborhood influence, and its global influence to identify influential nodes in the network. An exponential function with base e is introduced to measure the node's self-influence. In the local neighborhood, the node's one-hop and two-hop neighbors are considered, and information entropy is introduced to measure the node's local influence. The topological position of the node in the network and the shortest paths between nodes are considered to measure the node's global influence. To demonstrate the effectiveness of the proposed model, extensive comparison experiments are conducted against eight existing node centrality measures on six real network data sets, using node differentiation ability experiments, the susceptible-infected-recovered (SIR) model, and network efficiency as evaluation criteria. The experimental results show that the method can identify influential nodes in complex networks more accurately.
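Network efficiency, one of the evaluation criteria named above, has a standard definition: the average of the inverse shortest-path length over all ordered node pairs, with unreachable pairs contributing zero. The sketch below computes it for an unweighted, undirected graph and shows how removing a node changes it. It does not implement the SLGC model. The adjacency-dict representation, the function names, and the star-graph example are illustrative assumptions, and how the paper applies the metric to its rankings is not specified in the abstract.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source to every reachable node (unweighted graph)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def network_efficiency(adj):
    """Average of 1/d(i, j) over ordered pairs i != j; unreachable pairs add 0."""
    n = len(adj)
    if n < 2:
        return 0.0
    total = 0.0
    for s in adj:
        dist = bfs_distances(adj, s)
        total += sum(1.0 / d for t, d in dist.items() if t != s)
    return total / (n * (n - 1))


def remove_node(adj, node):
    """Copy of the graph with one node and its incident edges deleted."""
    return {u: [v for v in nbrs if v != node] for u, nbrs in adj.items() if u != node}


# Star graph: hub 0 connected to leaves 1, 2, 3 (hypothetical example).
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
base = network_efficiency(star)
without_hub = network_efficiency(remove_node(star, 0))
without_leaf = network_efficiency(remove_node(star, 3))
```

Removing the hub disconnects the graph and drops the efficiency to zero, while removing a leaf leaves a connected path, which matches the intuition that the hub is the more influential node.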
Clothing parsing, also known as clothing image segmentation, is the problem of assigning a clothing category label to each pixel in clothing images. To address the lack of positional and global priors in existing clothing parsing algorithms, this paper proposes an enhanced positional attention module (EPAM) to collect positional information in the vertical direction of each pixel, and an efficient global prior module (GPM) to aggregate contextual information from different sub-regions. The EPAM and GPM based residual network (EG-ResNet) can effectively exploit the intrinsic features of clothing images while capturing information between different scales and sub-regions. Experimental results show that the proposed EG-ResNet achieves promising performance in clothing parsing on the colorful fashion parsing dataset (CFPD), with 51.12% mean Intersection over Union (mIoU) and 92.79% pixel-wise accuracy (PA), compared with other state-of-the-art methods.
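The two metrics reported above, mIoU and PA, can both be derived from a per-class confusion matrix. The following is a minimal sketch using their standard definitions on flattened label lists. It is not the paper's evaluation script. The function names, the choice to skip classes that appear in neither the labels nor the predictions, and the toy labels are assumptions made for illustration.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def pixel_accuracy(cm):
    """Correctly labeled pixels divided by all pixels."""
    correct = sum(cm[c][c] for c in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def mean_iou(cm):
    """Mean over classes of TP / (TP + FP + FN)."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp
        fp = sum(cm[r][c] for r in range(n)) - tp
        denom = tp + fp + fn
        if denom > 0:  # assumption: skip classes absent from labels and predictions
            ious.append(tp / denom)
    return sum(ious) / len(ious)


# Six "pixels" with three classes (hypothetical labels).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, 3)
pa = pixel_accuracy(cm)
miou = mean_iou(cm)
```

In this toy case four of six pixels are correct, while the per-class IoU values are 1/3, 2/3, and 1/2. The gap between the two numbers mirrors why PA is usually much higher than mIoU on datasets dominated by a few large classes.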
Underground energy and resource development, deep underground energy storage, and similar projects involve the global stability of multiple interconnected cavern groups under internal and external dynamic disturbances. This study proposes a method for evaluating the global stability coefficient of underground caverns based on static and dynamic overload. First, the global failure criterion for the caverns is defined by the connection of plastic-strain bands between caverns. Then, overloading calculations of the boundary geostress and the seismic intensity are carried out on the cavern model, and the critical unstable state of the cavern group is identified when a plastic-strain band appears between caverns during these overloading processes. The global stability coefficient of the cavern group under static loading and under earthquake is thus obtained from the corresponding overloading coefficient. A practical analysis of the Yingliangbao (YLB) hydraulic caverns indicates that this method can not only obtain the global stability coefficient of caverns under static and seismic conditions, but also identify the high-risk zones of local instability through the localized plastic strain of the surrounding rock. This study can provide a reference for the layout design and seismic optimization of underground cavern groups.
As image manipulation technology advances rapidly, the malicious use of image tampering has escalated alarmingly, posing a significant threat to social stability. In image tampering localization, accurately localizing tampered regions remains challenging because samples are limited and regions vary in type and size. These issues impede a model's universality and generalization capability and degrade its performance. To tackle them, we propose FL-MobileViT, an improved MobileViT model devised for image tampering localization. The model uses a dual-stream architecture that processes the RGB and noise domains independently and captures richer traces of tampering through dual-stream integration. The model also incorporates the Focused Linear Attention mechanism within the lightweight network (MobileViT). This substitution significantly reduces computational complexity and resolves the homogeneity problems associated with traditional Transformer attention mechanisms, enhancing feature extraction diversity and improving the model's localization performance. To fuse the results from both feature extractors comprehensively, we introduce the ASPP architecture for multi-scale feature fusion, which enables more precise localization of tampered regions of various sizes. Furthermore, to strengthen the model's generalization ability, we adopt a contrastive learning method and devise a joint optimization training strategy that leverages the fused features and captures the disparities in feature distribution in tampered images. This strategy learns a contrastive loss at various stages of the feature extractor and uses it as an additional constraint alongside the cross-entropy loss. As a result, overfitting is effectively alleviated and the differentiation between tampered and untampered regions is enhanced. Experimental evaluations on five benchmark datasets (IMD-20, CASIA, NIST-16, Columbia, and Coverage) validate the effectiveness of the proposed model. The calibrated FL-MobileViT model consistently outperforms numerous existing general models in localization accuracy across diverse datasets, demonstrating superior adaptability.
The 13th China International Diecasting Congress & Exhibition (CHINA DIECASTING 2018) and the 2018 China Nonferrous Alloys & Special Casting Exhibition (CHINA NONFERROUS 2018) were held concurrently on July 18-20 at the Shanghai New International Expo Centre. The events were co-organized by the Foundry Institution of Chinese Mechanical Engineering Society (FICMES), Shenyang Zhongzhu Foundry Productivity Promotion Center Co., Ltd. (FPC), and the State Key Laboratory of Light Alloys Casting Technologies for High-end Equipment.
Acoustic source localization (ASL) and sound event detection (SED) are two widely pursued independent research fields. In recent years, sound event localization and detection (SELD) has become a very active research topic because it offers a more complete spatial and temporal representation of the sound field. This paper presents a deep learning-based algorithm for localizing and detecting multiple overlapping sound events in three-dimensional space. The Log-Mel spectrum and the generalized cross-correlation spectrum are concatenated along the channel dimension as input features. After training, a neural network classifies and regresses these features in parallel to obtain sound recognition and localization results, respectively. A channel attention mechanism is also introduced in the network to selectively enhance the features containing essential information and suppress the useless ones. Finally, a thorough comparison confirms the efficiency and effectiveness of the proposed SELD algorithm. Field experiments show that the proposed algorithm is robust to reverberation and environment and achieves higher recognition and localization accuracy than the baseline method.
Although there is no consensus on whether exposure to an Extremely Low Frequency Magnetic Field (ELF-MF) affects human brain activity, which limits guidelines for brain management, there is some evidence of related changes in human attention. This study therefore evaluates the effects of a 45 Hz sinusoidal ELF field (360 μT) at the Cz region, centered at the dominant frequency, using Electroencephalogram (EEG) analysis. The purpose was to extract transient or permanent events as an index for improving a new neurofeedback (NF) system. Twenty-four healthy volunteers aged between 20 and 28 years were randomly assigned to one of two groups, which differed in whether NF training was given with or without magnetic field exposure, in order to assess the effect on performance in attention tests during NF. Results indicate that variations in the theta and beta EEG rhythms in the exposed group changed more significantly than with traditional NF
Funding (pose-invariant FER study): Supported by the National Natural Science Foundation of China (No. 31872399) and the Advantage Discipline Construction Project (PAPD, No. 6-2018) of Jiangsu University.
Funding (diabetic retinopathy diagnosis study): Supported in part by the Research on the Application of Multimodal Artificial Intelligence in Diagnosis and Treatment of Type 2 Diabetes under Grant No. 2020SK50910, and in part by the Hunan Provincial Natural Science Foundation of China under Grant 2023JJ60020.
Funding (food image recognition study): Support from the Hubei Provincial Natural Science Foundation (2022CFB449) and the Science Research Foundation of the Education Department of Hubei Province (B2020061) is gratefully acknowledged.
Funding: Project supported by the National Natural Science Foundation of China (Grant Nos. 72174121 and 71774111), the Program for Professor of Special Appointment (Eastern Scholar) at Shanghai Institutions of Higher Learning, and the Natural Science Foundation of Shanghai (Grant No. 21ZR1444100).
Abstract: Today, with the rapid development of the internet, a large amount of information often accompanies the rapid transmission of disease outbreaks, and increasing numbers of scholars are studying the relationship between information and the disease transmission process using complex networks. In fact, the disease transmission process is very complex. Besides this information, there will often be individual behavioral measures and other factors to consider. Most of the previous research has aimed to establish a two-layer network model to consider the impact of information on the transmission process of disease, and has rarely treated information and behavior as separate layers. To carry out a more in-depth analysis of the disease transmission process and the intrinsic influencing mechanism, this paper divides information and behavior into two layers and proposes the establishment of a complex network to study the dynamic co-evolution of information diffusion, vaccination behavior, and disease transmission. This is achieved by considering four influential relationships between adjacent layers in multilayer networks. In the information layer, the diffusion process of negative information is described, and the feedback effects of local and global vaccination are considered. In the behavioral layer, an individual's vaccination behavior is described, and the probability of an individual receiving a vaccination is influenced by two factors: the influence of negative information, and the influence of local and global disease severity. In the disease layer, individual susceptibility is considered to be influenced by vaccination behavior. The state transition equations are derived using the micro Markov chain approach (MMCA), and disease prevalence thresholds are obtained. It is demonstrated through simulation experiments that the negative information diffusion is less influenced by local vaccination behavior, and is mainly influenced by global vaccination behavior; vaccination behavior is mainly influenced by local disease conditions, and is less influenced by global disease conditions; the disease transmission threshold increases with the increasing vaccination rate; and the scale of disease transmission increases with the increasing negative information diffusion rate and decreases with the increasing vaccination rate. Finally, it is found that when individual vaccination behavior considers both the influence of negative information and disease, it can increase the disease transmission threshold and reduce the scale of disease transmission. Therefore, we should resist the diffusion of negative information, increase vaccination proportions, and take appropriate protective measures in time.
Funding: National Natural Science Foundation of China, Grant/Award Number: 62106177; supported by the Central University Basic Research Fund of China (No. 2042020KF0016); supported by the supercomputing system in the Supercomputing Center of Wuhan University.
Abstract: The goal of street-to-aerial cross-view image geo-localization is to determine the location of the query street-view image by retrieving the aerial-view image from the same place. The drastic viewpoint and appearance gap between the aerial-view and the street-view images poses a major challenge for this task. In this paper, we propose a novel multiscale attention encoder to capture the multiscale contextual information of the aerial/street-view images. To bridge the domain gap between these two views, we first use an inverse polar transform to make the street-view images approximately aligned with the aerial-view images. Then, the proposed multiscale attention encoder is applied to convert the image into a feature representation with the guidance of the learnt multiscale information. Finally, we propose a novel global mining strategy to enable the network to pay more attention to hard negative exemplars. Experiments on standard benchmark datasets show that our approach obtains an 81.39% top-1 recall rate on the CVUSA dataset and 71.52% on the CVACT dataset, achieving state-of-the-art performance and outperforming most of the existing methods significantly.
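The top-1 recall rate quoted above can be illustrated with a minimal sketch. It assumes each street-view query and its matching aerial image share the same index and that features are compared by cosine similarity; the function names and the toy feature vectors are hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top1_recall(street_feats, aerial_feats):
    """Fraction of street-view queries whose most similar aerial feature
    is the ground-truth match (assumed to sit at the same index)."""
    hits = 0
    for i, query in enumerate(street_feats):
        sims = [cosine(query, gallery) for gallery in aerial_feats]
        best = max(range(len(sims)), key=sims.__getitem__)
        hits += int(best == i)
    return hits / len(street_feats)


# Toy 2-D features (illustrative values only).
street = [[1.0, 0.1], [0.2, 1.0], [0.9, 0.8]]
aerial = [[0.9, 0.0], [0.0, 1.0], [1.0, 0.1]]
recall = top1_recall(street, aerial)
```

In this toy gallery the first query is retrieved incorrectly because a different aerial feature points in exactly the same direction, which is the kind of hard negative a mining strategy is meant to emphasize during training.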
Funding: Project supported by the National Natural Science Foundation of China (Nos. 11932008 and 12272156), the Fundamental Research Funds for the Central Universities (No. lzujbky-2022-kb06), the Gansu Science and Technology Program, and Lanzhou City’s Scientific Research Funding Subsidy to Lanzhou University of China.
Abstract: Second-generation high-temperature superconducting (HTS) conductors, specifically rare earth-barium-copper-oxide (REBCO) coated conductor (CC) tapes, are promising candidates for high-energy and high-field superconducting applications. In epoxy-impregnated REBCO composite magnets, which comprise multilayer components, the thermomechanical characteristics of each component differ considerably under extremely low temperatures and strong electromagnetic fields. Traditional numerical models include homogenized orthotropic models, which simplify overall field calculation but miss detailed multi-physics aspects, and full refinement (FR) models, which are thorough but computationally demanding. Herein, we propose an extended multi-scale approach for analyzing the multi-field characteristics of an epoxy-impregnated composite magnet assembled from HTS pancake coils. This approach combines a global homogenization (GH) scheme based on the homogenized electromagnetic T-A model, a method for solving Maxwell's equations for superconducting materials based on the current vector potential T and the magnetic field vector potential A, and a homogenized orthotropic thermoelastic model to assess the electromagnetic and thermoelastic properties at the macroscopic scale. We then identify “dangerous regions” at the macroscopic scale and obtain finer details using a local refinement (LR) scheme to capture the responses of each component material in the HTS composite tapes at the mesoscopic scale. The results of the present GH-LR multi-scale approach agree well with those of the FR scheme and the experimental data in the literature, indicating that the present approach is accurate and efficient. The proposed GH-LR multi-scale approach can serve as a valuable tool for evaluating the risk of failure in large-scale HTS composite magnets.
Funding: This work was funded by the Deanship of Scientific Research at Jouf University under grant No. DSR-2021-02-0379.
Abstract: Indoor localization methods can help many sectors, such as healthcare centers, smart homes, museums, warehouses, and retail malls, improve their service areas. As a result, it is crucial to look for low-cost methods that can provide exact localization in indoor locations. In this context, image-based localization methods can play an important role in estimating both the position and the orientation of cameras regarding an object. Image-based localization faces many issues, such as image scale and rotation variance. Also, image-based localization’s accuracy and speed (latency) are two critical factors. This paper proposes an efficient 6-DoF deep-learning model for image-based localization. This model incorporates the channel attention module and the Scale Pyramid Module (SPM). It not only enhances accuracy but also ensures the model’s real-time performance. In complex scenes, a channel attention module is employed to distinguish between the textures of the foregrounds and backgrounds. Our model adopts an SPM, a feature pyramid module for dealing with image scale and rotation variance issues. Furthermore, the proposed model employs two regressions (two fully connected layers), one for position and the other for orientation, which increases outcome accuracy. Experiments on standard indoor and outdoor datasets show that the proposed model has a significantly lower Mean Squared Error (MSE) for both position and orientation. On the indoor 7-Scenes dataset, the MSE is reduced to 0.19 m for the position and 6.25° for the orientation. Furthermore, on the outdoor Cambridge landmarks dataset, the MSE is reduced to 0.63 m for the position and 2.03° for the orientation. According to the findings, the proposed approach is superior and more successful than the baseline methods.
Funding: Supported in part by the National Natural Science Fund for Outstanding Young Scholars of China (61922072), the National Natural Science Foundation of China (62176238, 61806179, 61876169, 61976237), the China Postdoctoral Science Foundation (2020M682347), the Training Program of Young Backbone Teachers in Colleges and Universities in Henan Province (2020GGJS006), and the Henan Provincial Young Talents Lifting Project (2021HYTP007).
Abstract: Constrained multi-objective optimization problems (CMOPs) include the optimization of objective functions and the satisfaction of constraint conditions, which challenge the solvers. To solve CMOPs, constrained multi-objective evolutionary algorithms (CMOEAs) have been developed. However, most of them tend to converge into local areas due to the loss of diversity. Evolutionary multitasking (EMT) is a new model for solving complex optimization problems through knowledge transfer between the source task and other related tasks. Inspired by EMT, this paper develops a new EMT-based CMOEA to solve CMOPs, in which the main task, a global auxiliary task, and a local auxiliary task are created, each optimized by its own population. The main task focuses on finding the feasible Pareto front (PF), and the global and local auxiliary tasks are used to enhance global and local diversity, respectively. Moreover, the global auxiliary task is used to implement the global search by ignoring constraints, so as to help the population of the main task pass through infeasible obstacles. The local auxiliary task is used to provide local diversity around the population of the main task, so as to exploit promising regions. Through the knowledge transfer among the three tasks, the search ability of the population of the main task will be significantly improved. Compared with other state-of-the-art CMOEAs, the experimental results on three benchmark test suites demonstrate the superior or competitive performance of the proposed CMOEA.
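The difference between the main task (feasible Pareto front) and the global auxiliary task (constraints ignored) can be sketched as follows. This is only an illustration of the two selection criteria for minimization problems; the representation of a solution as an (objectives, constraint_violation) pair and all function names are assumptions, not the paper's algorithm.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimization)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def non_dominated(points):
    """Keep the points that no other point in the list dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def feasible_front(population):
    """Main-task view: only solutions with zero constraint violation count."""
    feasible = [obj for obj, violation in population if violation <= 0.0]
    return non_dominated(feasible)


def unconstrained_front(population):
    """Global-auxiliary view: constraints are ignored entirely."""
    return non_dominated([obj for obj, _ in population])


# Hypothetical population: (objective vector, total constraint violation).
pop = [
    ((1.0, 4.0), 0.0),
    ((2.0, 2.0), 0.0),
    ((3.0, 3.0), 0.0),
    ((0.5, 0.5), 1.2),  # best objectives, but infeasible
    ((4.0, 1.0), 0.0),
]
main_front = feasible_front(pop)
aux_front = unconstrained_front(pop)
```

Here the infeasible point dominates everything once constraints are ignored, so the auxiliary view is free to search through infeasible regions, while the main view still reports only feasible trade-offs.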
Funding: This work was supported, in part, by the National Nature Science Foundation of China under Grant Numbers 61502240, 61502096, 61304205, 61773219; in part, by the Natural Science Foundation of Jiangsu Province under Grant Numbers BK20201136, BK20191401; in part, by the Postgraduate Research & Practice Innovation Program of Jiangsu Province under Grant Numbers SJCX21_0363; in part, by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD) fund.
Abstract: Vehicle re-identification (ReID) aims to retrieve the target vehicle in an extensive image gallery through its appearances from various views in the cross-camera scenario. It has gradually become a core technology of intelligent transportation systems. Most existing vehicle re-identification models adopt the joint learning of global and local features. However, they directly use the extracted global features, resulting in insufficient feature expression. Moreover, local features are primarily obtained through advanced annotation and complex attention mechanisms, which require additional costs. To solve these issues, a multi-feature learning model with enhanced local attention for vehicle re-identification (MFELA) is proposed in this paper. The model consists of global and local branches. The global branch utilizes both middle and high-level semantic features of ResNet50 to enhance the global representation capability. In addition, multi-scale pooling operations are used to obtain multi-scale information. The local branch utilizes the proposed Region Batch Dropblock (RBD), which encourages the model to learn discriminative features for different local regions and simultaneously drops the same corresponding areas randomly in a batch during training to enhance the attention to local regions. Then features from both branches are combined to provide a more comprehensive and distinctive feature representation. Extensive experiments on the VeRi-776 and VehicleID datasets show that our method achieves excellent performance.
Funding: The National Natural Science Foundation of China (61971296, U19A2078, 61836011, 61801315), the Ministry of Education and China Mobile Research Foundation Project (MCM20180405), and the Sichuan Science and Technology Planning Project (2019YFG0495, 2021YFG0301, 2021YFG0317, 2020YFG0319, 2020YFH0186).
Abstract: In this paper, we develop a novel global-attention-based neural network (GANN) for vision language intelligence, specifically, image captioning (language description of a given image). As in many previous works, the encoder-decoder framework is adopted in our proposed model, in which the encoder is responsible for encoding the region proposal features and extracting a global caption feature based on a specially designed module for predicting the caption objects, and the decoder generates captions by taking the obtained global caption feature along with the encoded visual features as inputs for each attention head of the decoder layer. The global caption feature is introduced for the purpose of exploring the latent contributions of region proposals for image captioning, and further helping the decoder better focus on the most relevant proposals so as to extract more accurate visual features in each time step of caption generation. Our GANN is implemented by incorporating the global caption feature into the attention weight calculation phase of the word prediction process in each head of the decoder layer. In our experiments, we qualitatively analyzed the proposed model, and quantitatively evaluated several state-of-the-art schemes with GANN on the MS-COCO dataset. Experimental results demonstrate the effectiveness of the proposed global attention mechanism for image captioning.
Funding: Project supported by the Natural Science Basic Research Program of Shaanxi Province of China (Grant No. 2022JQ675) and the Youth Innovation Team of Shaanxi Universities.
Abstract: Identifying influential nodes in complex networks and ranking their importance plays an important role in many fields such as public opinion analysis, marketing, and epidemic prevention and control. To address the problem that existing node centrality measures consider only a specific statistical feature of a single dimension, an SLGC model is proposed that combines a node’s self-influence, its local neighborhood influence, and its global influence to identify influential nodes in the network. The exponential function of e is introduced to measure the node’s self-influence; in the local neighborhood, the node’s one-hop and two-hop neighboring nodes are considered, while the information entropy is introduced to measure the node’s local influence; the topological position of the node in the network and the shortest path between nodes are considered to measure the node’s global influence. To demonstrate the effectiveness of the proposed model, extensive comparison experiments are conducted with eight existing node centrality measures on six real network data sets, using node differentiation ability experiments, the susceptible–infected–recovered (SIR) model, and network efficiency as evaluation criteria. The experimental results show that the method can identify influential nodes in complex networks more accurately.
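Using the SIR model as an evaluation criterion usually means seeding an epidemic at a single node and measuring how many nodes end up recovered. The sketch below is a generic discrete-time SIR simulation under assumed parameters (infection probability `beta`, recovery probability `gamma`); the function names, the star graph, and the parameter values are hypothetical and do not reproduce the paper's experimental setup.

```python
import random


def sir_spread(adj, seed_node, beta, rng, gamma=1.0):
    """Run one discrete-time SIR outbreak from seed_node and return the
    final number of recovered nodes (the outbreak size)."""
    infected = {seed_node}
    recovered = set()
    while infected:
        newly_infected = set()
        for u in infected:
            for v in adj[u]:
                susceptible = (v not in infected and v not in recovered
                               and v not in newly_infected)
                if susceptible and rng.random() < beta:
                    newly_infected.add(v)
        still_infected = set()
        for u in infected:
            if rng.random() < gamma:
                recovered.add(u)
            else:
                still_infected.add(u)
        infected = still_infected | newly_infected
    return len(recovered)


def mean_spread(adj, seed_node, beta, runs=2000, seed=0):
    """Average outbreak size over repeated simulations (fixed RNG seed)."""
    rng = random.Random(seed)
    return sum(sir_spread(adj, seed_node, beta, rng) for _ in range(runs)) / runs


# Star graph: node 0 is the hub, nodes 1..5 are leaves.
star = {0: [1, 2, 3, 4, 5], 1: [0], 2: [0], 3: [0], 4: [0], 5: [0]}
hub_score = mean_spread(star, 0, beta=0.3)
leaf_score = mean_spread(star, 1, beta=0.3)
```

A centrality measure can then be judged by how well its ranking agrees with the ranking by average outbreak size; on the star graph the hub is expected to score higher than any leaf.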
Funding: National Natural Science Foundation of China (No. 62006039) and Shanghai Special Fund for Software and Integrated Circuit Industry Development, China (No. 180330).
Abstract: Clothing parsing, also known as clothing image segmentation, is the problem of assigning a clothing category label to each pixel in clothing images. To address the lack of positional and global prior in existing clothing parsing algorithms, this paper proposes an enhanced positional attention module (EPAM) to collect positional information in the vertical direction of each pixel, and an efficient global prior module (GPM) to aggregate contextual information from different sub-regions. The EPAM and GPM based residual network (EG-ResNet) can effectively exploit the intrinsic features of clothing images while capturing information between different scales and sub-regions. Experimental results show that the proposed EG-ResNet achieves promising performance in clothing parsing on the colorful fashion parsing dataset (CFPD), with a mean Intersection over Union (mIoU) of 51.12% and a pixel-wise accuracy (PA) of 92.79%, compared with other state-of-the-art methods.
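The two reported metrics, mIoU and PA, can both be derived from a confusion matrix over pixel labels. The following is a minimal sketch using flattened label lists; the helper names and the toy labels are illustrative, and skipping classes that appear in neither prediction nor ground truth is one common convention assumed here, not something stated in the abstract.

```python
def confusion_matrix(pred, truth, num_classes):
    """cm[t][p] counts pixels with ground-truth class t predicted as class p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, truth):
        cm[t][p] += 1
    return cm


def pixel_accuracy(cm):
    """Share of pixels whose predicted label equals the ground truth."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def mean_iou(cm):
    """Average of per-class TP / (TP + FP + FN), skipping absent classes."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp
        fp = sum(cm[r][c] for r in range(n)) - tp
        denom = tp + fp + fn
        if denom:
            ious.append(tp / denom)
    return sum(ious) / len(ious)


# Six pixels, three classes (toy labels).
truth = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(pred, truth, num_classes=3)
pa = pixel_accuracy(cm)
miou = mean_iou(cm)
```

Because PA is dominated by large classes such as background while mIoU weights every class equally, a large gap between the two, as in the figures above, is common in parsing tasks with many small categories.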
Funding: Project (2023YFC2907204) supported by the National Key Research and Development Program of China; Project (52325905) supported by the National Natural Science Foundation of China; Project (DJ-HXGG-2023-16) supported by the Key Technology Research Projects of Power China.
Abstract: Underground energy and resource development, deep underground energy storage, and other projects involve the global stability of multiple interconnected cavern groups under internal and external dynamic disturbances. An evaluation method for the global stability coefficient of underground caverns based on static overload and dynamic overload is proposed. Firstly, the global failure criterion for caverns is defined based on the connection of plastic-strain bands between caverns. Then, overloading calculations of the boundary geostress and seismic intensity are carried out on the cavern model, and the critical unstable state of the caverns is identified when a plastic-strain band appears between caverns during these overloading processes. Thus, the global stability coefficient for the caverns under static loading and earthquake is obtained based on the corresponding overloading coefficient. Practical analysis of the Yingliangbao (YLB) hydraulic caverns indicates that this method can not only effectively obtain the global stability coefficient of caverns under static and dynamic earthquake conditions, but also identify the caverns’ high-risk zones of local instability through the localized plastic strain of the surrounding rock. This study can provide a reference for the layout design and seismic optimization of underground cavern groups.
Funding: This study was funded by the Science and Technology Project in Xi’an (No. 22GXFW0123). This work was supported by the Special Fund Construction Project of Key Disciplines in Ordinary Colleges and Universities in Shaanxi Province. The authors would like to thank the anonymous reviewers for their helpful comments and suggestions.
Abstract: As image manipulation technology advances rapidly, the malicious use of image tampering has alarmingly escalated, posing a significant threat to social stability. In the realm of image tampering localization, accurately localizing tampered regions given limited samples, multiple tampering types, and various region sizes remains challenging. These issues impede the model’s universality and generalization capability and detrimentally affect its performance. To tackle these issues, we propose FL-MobileViT, an improved MobileViT model devised for image tampering localization. Our proposed model utilizes a dual-stream architecture that independently processes the RGB and noise domains, and captures richer traces of tampering through dual-stream integration. Meanwhile, the model incorporates the Focused Linear Attention mechanism within the lightweight network (MobileViT). This substitution significantly diminishes computational complexity and resolves homogeneity problems associated with traditional Transformer attention mechanisms, enhancing feature extraction diversity and improving the model’s localization performance. To comprehensively fuse the generated results from both feature extractors, we introduce the ASPP architecture for multi-scale feature fusion. This facilitates a more precise localization of tampered regions of various sizes. Furthermore, to bolster the model’s generalization ability, we adopt a contrastive learning method and devise a joint optimization training strategy that leverages fused features and captures the disparities in feature distribution in tampered images. This strategy enables the learning of contrastive loss at various stages of the feature extractor and employs it as an additional constraint condition in conjunction with cross-entropy loss. As a result, overfitting issues are effectively alleviated, and the differentiation between tampered and untampered regions is enhanced. Experimental evaluations on five benchmark datasets (IMD-20, CASIA, NIST-16, Columbia, and Coverage) validate the effectiveness of our proposed model. The meticulously calibrated FL-MobileViT model consistently outperforms numerous existing general models regarding localization accuracy across diverse datasets, demonstrating superior adaptability.
Abstract: The 13th China International Diecasting Congress & Exhibition (CHINA DIECASTING 2018) and 2018 China Nonferrous Alloys & Special Casting Exhibition (CHINA NONFERROUS 2018) were held concurrently July 18-20 at Shanghai New International Expo Centre. The events were co-organized by the Foundry Institution of Chinese Mechanical Engineering Society (FICMES), Shenyang Zhongzhu Foundry Productivity Promotion Center Co., Ltd. (FPC) and the State Key Laboratory of Light Alloys Casting Technologies for High-end Equipment.
Funding: Supported by the National Natural Science Foundation of China (61877067), the Foundation of Science and Technology on Near-Surface Detection Laboratory (TCGZ2019A002, TCGZ2021C003, 6142414200511), and the Natural Science Basic Research Program of Shaanxi (2021JZ-19).
Abstract: Acoustic source localization (ASL) and sound event detection (SED) are two widely pursued independent research fields. In recent years, in order to achieve a more complete spatial and temporal representation of the sound field, sound event localization and detection (SELD) has become a very active research topic. This paper presents a deep learning-based multi-overlapping sound event localization and detection algorithm in three-dimensional space. The Log-Mel spectrum and the generalized cross-correlation spectrum are joined together in the channel dimension as input features. These features are classified and regressed in parallel by a trained neural network to obtain sound recognition and localization results, respectively. The channel attention mechanism is also introduced in the network to selectively enhance the features containing essential information and suppress the useless features. Finally, a thorough comparison confirms the efficiency and effectiveness of the proposed SELD algorithm. Field experiments show that the proposed algorithm is robust to reverberation and environment and can achieve higher recognition and localization accuracy compared with the baseline method.
Abstract: Although there is no consensus on whether exposure to an Extremely Low Frequency Magnetic Field (ELF-MF) affects human brain activity, as would be needed for brain management guidelines, there is some evidence of related changes in human attention. Therefore, this study evaluates the effects of a 45 Hz sinusoidal ELF (360 μT) at Cz regions, centered at the dominant frequency, using Electroencephalogram (EEG) analysis. The purpose was to extract transient or permanent events as an index for improving a new neurofeedback (NF) system. Twenty-four healthy volunteers aged between 20 and 28 years were randomly assigned to one of two groups, which differed in the type of NF training, to compare the effect of exposure and non-exposure to the magnetic field on performance in attention tests during NF. Results indicate that theta and beta EEG rhythm variations in the exposed group changed more significantly in comparison with traditional NF