Journal Articles
3,565 articles found
1. Image Inpainting Technique Incorporating Edge Prior and Attention Mechanism
Authors: Jinxian Bai, Yao Fan, Zhiwei Zhao, Lizhi Zheng. Computers, Materials & Continua, SCIE EI, 2024(1): 999-1025.
Recently, deep learning-based image inpainting methods have made great strides in reconstructing damaged regions. However, these methods often struggle to produce satisfactory results when dealing with missing images with large holes, leading to distortions in the structure and blurring of textures. To address these problems, we combine the advantages of transformers and convolutions to propose an image inpainting method that incorporates edge priors and attention mechanisms. The proposed method aims to improve the results of inpainting large holes in images by enhancing the accuracy of structure restoration and the ability to recover texture details. This method divides the inpainting task into two phases: edge prediction and image inpainting. Specifically, in the edge prediction phase, a transformer architecture is designed to combine axial attention with standard self-attention. This design enhances the extraction capability of global structural features and location awareness. It also balances the complexity of self-attention operations, resulting in accurate prediction of the edge structure in the defective region. In the image inpainting phase, a multi-scale fusion attention module is introduced. This module makes full use of multi-level distant features and enhances local pixel continuity, thereby significantly improving the quality of image inpainting. To evaluate the performance of our method, comparative experiments are conducted on several datasets, including CelebA, Places2, and Facade. Quantitative experiments show that our method outperforms the other mainstream methods. Specifically, it improves Peak Signal-to-Noise Ratio (PSNR) and Structure Similarity Index Measure (SSIM) by 1.141~3.234 dB and 0.083~0.235, respectively. Moreover, it reduces Learned Perceptual Image Patch Similarity (LPIPS) and Mean Absolute Error (MAE) by 0.0347~0.1753 and 0.0104~0.0402, respectively. Qualitative experiments reveal that our method excels at reconstructing images with complete structural information and clear texture details. Furthermore, our model exhibits impressive performance in terms of the number of parameters, memory cost, and testing time.
Keywords: image inpainting; transformer; edge prior; axial attention; multi-scale fusion attention
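The axial-attention design this abstract describes restricts self-attention to one spatial axis at a time, which is how the attention cost stays tractable on large feature maps. The PyTorch sketch below illustrates that idea only; the class name, head count, and tensor layout are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of 2D axial attention (row attention followed by column attention).
# Cost is O(H*W*(H+W)) rather than O((H*W)^2) for full self-attention.
import torch
import torch.nn as nn

class AxialAttention2D(nn.Module):
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.row_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.col_attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) feature map
        b, c, h, w = x.shape
        # Attend along the width axis: each row is an independent sequence.
        rows = x.permute(0, 2, 3, 1).reshape(b * h, w, c)        # (B*H, W, C)
        rows, _ = self.row_attn(rows, rows, rows)
        x = rows.reshape(b, h, w, c).permute(0, 3, 1, 2)
        # Attend along the height axis: each column is an independent sequence.
        cols = x.permute(0, 3, 2, 1).reshape(b * w, h, c)        # (B*W, H, C)
        cols, _ = self.col_attn(cols, cols, cols)
        return cols.reshape(b, w, h, c).permute(0, 3, 2, 1)      # (B, C, H, W)

# Example: a 64-channel feature map of a 32x32 edge map.
feat = torch.randn(2, 64, 32, 32)
print(AxialAttention2D(dim=64)(feat).shape)   # torch.Size([2, 64, 32, 32])
```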
2. Image Hiding with High Robustness Based on Dynamic Region Attention in the Wavelet Domain
Authors: Zengxiang Li, Yongchong Wu, Alanoud Al Mazroa, Donghua Jiang, Jianhua Wu, Xishun Zhu. Computer Modeling in Engineering & Sciences, SCIE EI, 2024(10): 847-869.
Hidden capacity, concealment, security, and robustness are essential indicators of hiding algorithms. Currently, hiding algorithms tend to focus on algorithmic capacity, concealment, and security but often overlook the robustness of the algorithms. In practical applications, the container can suffer from damage caused by noise, cropping, and other attacks during transmission, resulting in challenging or even impossible complete recovery of the secret image. An image hiding algorithm based on dynamic region attention in the multi-scale wavelet domain is proposed to address this issue and enhance the robustness of hiding algorithms. In this proposed algorithm, a secret image of size 256×256 is first decomposed using an eight-level Haar wavelet transform. The wavelet transform generates one coefficient in the approximation component and twenty-four detail bands, which are then embedded into the carrier image via a hiding network. During the recovery process, the container image is divided into four non-overlapping parts, each employed to reconstruct a low-resolution secret image. These low-resolution secret images are combined using dense modules to obtain a high-quality secret image. The experimental results showed that even under destructive attacks on the container image, the proposed algorithm is successful in recovering a high-quality secret image, indicating that the algorithm exhibits a high degree of robustness against various attacks. The proposed algorithm effectively addresses the robustness issue by incorporating both spatial and channel attention mechanisms in the multi-scale wavelet domain, making it suitable for practical applications. In conclusion, the image hiding algorithm introduced in this study offers significant improvements in robustness compared to existing algorithms. Its ability to recover high-quality secret images even in the presence of destructive attacks makes it an attractive option for various applications. Further research and experimentation can explore the algorithm's performance under different scenarios and expand its potential applications.
Keywords: image hiding; robustness; wavelet transform; dynamic region attention
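To make the counts in the abstract concrete, the sketch below (an illustration using PyWavelets, not the authors' code) shows how an eight-level Haar decomposition of a 256×256 image yields exactly one approximation coefficient and twenty-four detail bands; the hiding network itself is not reproduced.

```python
# Eight-level Haar decomposition of a 256x256 secret image:
# 1 approximation coefficient + 8 levels x 3 sub-bands = 24 detail bands.
import numpy as np
import pywt  # PyWavelets

secret = np.random.rand(256, 256)                      # stand-in for the secret image
coeffs = pywt.wavedec2(secret, wavelet='haar', level=8)

approx = coeffs[0]                                     # approximation component
detail_bands = [band for lvl in coeffs[1:] for band in lvl]  # (LH, HL, HH) per level

print(approx.shape)          # (1, 1): one approximation coefficient
print(len(detail_bands))     # 24 detail bands

# Perfect reconstruction check (what a lossless embedding would have to preserve):
restored = pywt.waverec2(coeffs, wavelet='haar')
print(np.allclose(restored, secret))                   # True
```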
3. CMMCAN: Lightweight Feature Extraction and Matching Network for Endoscopic Images Based on Adaptive Attention
Authors: Nannan Chong, Fan Yang. Computers, Materials & Continua, SCIE EI, 2024(8): 2761-2783.
In minimally invasive surgery, endoscopes or laparoscopes equipped with miniature cameras and tools are used to enter the human body for therapeutic purposes through small incisions or natural cavities. However, in clinical operating environments, endoscopic images often suffer from challenges such as low texture, uneven illumination, and non-rigid structures, which affect feature observation and extraction. This can severely impact surgical navigation or clinical diagnosis due to missing feature points in endoscopic images, leading to treatment and postoperative recovery issues for patients. To address these challenges, this paper introduces, for the first time, a Cross-Channel Multi-Modal Adaptive Spatial Feature Fusion (ASFF) module based on the lightweight architecture of EfficientViT. Additionally, a novel lightweight feature extraction and matching network based on an attention mechanism is proposed. This network dynamically adjusts attention weights for cross-modal information from grayscale images and optical flow images through a dual-branch Siamese network. It extracts static and dynamic information features ranging from low-level to high-level, and from local to global, ensuring robust feature extraction across different widths, noise levels, and blur scenarios. Global and local matching are performed through a multi-level cascaded attention mechanism, with cross-channel attention introduced to simultaneously extract low-level and high-level features. Extensive ablation experiments and comparative studies are conducted on the HyperKvasir, EAD, M2caiSeg, CVC-ClinicDB, and UCL synthetic datasets. Experimental results demonstrate that the proposed network improves upon the baseline EfficientViT-B3 model by 75.4% in accuracy (Acc), while also enhancing runtime performance and storage efficiency. When compared with the complex DenseDescriptor feature extraction network, the difference in Acc is less than 7.22%, and IoU calculation results on specific datasets outperform complex dense models. Furthermore, this method increases the F1 score by 33.2% and accelerates runtime by 70.2%. It is noteworthy that the speed of CMMCAN surpasses that of comparative lightweight models, with feature extraction and matching performance comparable to existing complex models but with faster speed and higher cost-effectiveness.
Keywords: feature extraction and matching; lightweight network; medical images; endoscopic; attention
4. An Image Fingerprint and Attention Mechanism Based Load Estimation Algorithm for Electric Power System
Authors: Qing Zhu, Linlin Gu, Huijie Lin. Computer Modeling in Engineering & Sciences, SCIE EI, 2024(7): 577-591.
With the rapid development of electric power systems, load estimation plays an important role in system operation and planning. Usually, load estimation techniques comprise traditional, time series, regression analysis-based, and machine learning-based estimation. Since machine learning-based methods can lead to better performance, in this paper, a deep learning-based load estimation algorithm using an image fingerprint and an attention mechanism is proposed. First, an image fingerprint construction is proposed for the training data. After data preprocessing, the training data matrix is constructed by cyclic shift and cubic spline interpolation. Then, a linear mapping and a gray-color transformation method are proposed to form the color image fingerprint. Second, a convolutional neural network (CNN) combined with an attention mechanism is proposed to improve training performance. Finally, an experiment is carried out to evaluate the estimation performance. Compared with the support vector machine method, the CNN method, and the long short-term memory method, the proposed algorithm has the best load estimation performance.
Keywords: load estimation; deep learning; attention mechanism; image fingerprint construction
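The abstract names three steps for building the image fingerprint: cyclic shift, cubic spline interpolation, and a gray-to-color transform. The sketch below is only one plausible reading of that pipeline; the matrix sizes, shift steps, and the colormap are assumptions for illustration, not the paper's settings.

```python
# Hedged sketch: turn a 1-D load series into a color "image fingerprint".
import numpy as np
from scipy.interpolate import CubicSpline
import matplotlib.cm as cm

def load_fingerprint(load_series: np.ndarray, out_size: int = 64) -> np.ndarray:
    n = len(load_series)
    # 1) Cyclic shift: row k is the series rotated by k samples.
    matrix = np.stack([np.roll(load_series, k) for k in range(n)], axis=0)
    # 2) Cubic-spline interpolation to a fixed resolution along both axes.
    xs = np.arange(n)
    xq = np.linspace(0, n - 1, out_size)
    rows = np.stack([CubicSpline(xs, row)(xq) for row in matrix], axis=0)
    grid = np.stack([CubicSpline(xs, col)(xq) for col in rows.T], axis=1)
    # 3) Linear mapping to [0, 1] followed by a gray-to-color transform.
    gray = (grid - grid.min()) / (grid.max() - grid.min() + 1e-12)
    return cm.viridis(gray)[..., :3]                  # (out_size, out_size, 3) RGB

fp = load_fingerprint(np.sin(np.linspace(0, 4 * np.pi, 24)) + 1.0)
print(fp.shape)                                        # (64, 64, 3)
```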
5. A Cover-Independent Deep Image Hiding Method Based on Domain Attention Mechanism
Authors: Nannan Wu, Xianyi Chen, James Msughter Adeke, Junjie Zhao. Computers, Materials & Continua, SCIE EI, 2024(3): 3001-3019.
Recently, deep image-hiding techniques have attracted considerable attention in covert communication and high-capacity information hiding. However, these approaches have some limitations; for example, the cover image lacks self-adaptability, and there may be information leakage or weak concealment. To address these issues, this study proposes a universal and adaptable image-hiding method. First, a domain attention mechanism is designed by combining Atrous convolution, which makes better use of the relationship between the secret image domain and the cover image domain. Second, to improve perceived human similarity, a perceptual loss is incorporated into the training process. The experimental results are promising, with the proposed method achieving an average pixel discrepancy (APD) of 1.83 and a peak signal-to-noise ratio (PSNR) value of 40.72 dB between the cover and stego images, indicative of its high-quality output. Furthermore, the structural similarity index measure (SSIM) reaches 0.985 while the learned perceptual image patch similarity (LPIPS) remarkably registers at 0.0001. Moreover, self-testing and cross-experiments demonstrate the model's adaptability and generalization in unknown hidden spaces, making it suitable for diverse computer vision tasks.
Keywords: deep image hiding; attention mechanism; privacy protection; data security; visual quality
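The abstract combines Atrous (dilated) convolution with an attention mechanism to relate the secret-image and cover-image domains. The block below is a hedged sketch of that idea only; channel counts, dilation rates, and the gating form are illustrative assumptions rather than the authors' design.

```python
# Sketch of a domain-attention block built on parallel Atrous convolutions.
import torch
import torch.nn as nn

class DomainAttention(nn.Module):
    def __init__(self, channels: int = 64):
        super().__init__()
        # Parallel Atrous convolutions enlarge the receptive field without pooling.
        self.atrous = nn.ModuleList([
            nn.Conv2d(2 * channels, channels, 3, padding=d, dilation=d)
            for d in (1, 2, 4)
        ])
        self.fuse = nn.Sequential(nn.Conv2d(3 * channels, channels, 1), nn.Sigmoid())

    def forward(self, secret_feat: torch.Tensor, cover_feat: torch.Tensor) -> torch.Tensor:
        joint = torch.cat([secret_feat, cover_feat], dim=1)      # relate the two domains
        multi_scale = torch.cat([conv(joint) for conv in self.atrous], dim=1)
        attn = self.fuse(multi_scale)                            # (B, C, H, W) in [0, 1]
        # Attention re-weights the secret features before they are hidden in the cover.
        return secret_feat * attn + cover_feat

secret = torch.randn(1, 64, 128, 128)
cover = torch.randn(1, 64, 128, 128)
print(DomainAttention(64)(secret, cover).shape)   # torch.Size([1, 64, 128, 128])
```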
6. Efficient Unsupervised Image Stitching Using Attention Mechanism with Deep Homography Estimation
Authors: Chunbin Qin, Xiaotian Ran. Computers, Materials & Continua, SCIE EI, 2024(4): 1319-1334.
Traditional feature-based image stitching techniques often encounter obstacles when dealing with images lacking unique attributes or suffering from quality degradation. The scarcity of annotated datasets in real-life scenes severely undermines the reliability of supervised learning methods in image stitching. Furthermore, existing deep learning architectures designed for image stitching are often too bulky to be deployed on mobile and peripheral computing devices. To address these challenges, this study proposes a novel unsupervised image stitching method based on the YOLOv8 (You Only Look Once version 8) framework that introduces deep homography networks and attention mechanisms. The methodology is partitioned into three distinct stages. The initial stage combines the attention mechanism with a pooling pyramid model to enhance the detection and recognition of compact objects in images; the task of the deep homography network module is to estimate the global homography of the input images considering multiple viewpoints. The second stage involves preliminary stitching of the masks generated in the initial stage and further enhancement through weighted computation to eliminate common stitching artifacts. The final stage is characterized by adaptive reconstruction and careful refinement of the initial stitching results. Comprehensive experiments across multiple datasets are executed to meticulously assess the proposed model. Our method's Peak Signal-to-Noise Ratio (PSNR) and Structure Similarity Index Measure (SSIM) improved by 10.6% and 6%. These experimental results confirm the efficacy and utility of the model presented in this paper.
Keywords: unsupervised image stitching; deep homography estimation; YOLOv8; attention mechanism
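Deep homography networks of the kind referenced above commonly predict four corner offsets and recover the 3×3 homography from them. The sketch below illustrates that parameterization with OpenCV; the offsets here are random stand-ins for network output, and the paper's exact formulation is not reproduced.

```python
# 4-point homography parameterization: corner offsets -> 3x3 homography -> warp.
import cv2
import numpy as np

h, w = 240, 320
image = np.random.randint(0, 255, (h, w, 3), dtype=np.uint8)

corners = np.float32([[0, 0], [w - 1, 0], [w - 1, h - 1], [0, h - 1]])
predicted_offsets = np.float32([[5, -3], [-4, 6], [7, 2], [-6, -5]])   # assumed network output
shifted = corners + predicted_offsets

H = cv2.getPerspectiveTransform(corners, shifted)     # homography from 4 correspondences
warped = cv2.warpPerspective(image, H, (w, h))        # align one view onto the other
print(H.shape, warped.shape)                          # (3, 3) (240, 320, 3)
```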
7. A Tabletop Nano-CT Image Noise Reduction Network Based on 3-Dimensional Axial Attention Mechanism
Authors: Huijuan Fu, Linlin Zhu, Chunhui Wang, Xiaoqi Xi, Yu Han, Lei Li, Yanmin Sun, Bin Yan. Computers, Materials & Continua, SCIE EI, 2024(7): 1711-1725.
Nano-computed tomography (Nano-CT) is an emerging, high-resolution imaging technique. However, due to its low-light properties, tabletop Nano-CT has to be scanned under long exposure conditions, which makes the scanning process time-consuming. For 3D reconstruction data, this paper proposes a lightweight 3D noise reduction method for desktop-level Nano-CT called AAD-ResNet (Axial Attention DeNoise ResNet). The network is framed by the U-Net structure. The encoder and decoder incorporate the proposed 3D axial attention mechanism and residual dense block. Each layer of the residual dense block can directly access the features of the previous layer, which reduces the redundancy of parameters and improves the efficiency of network training. The 3D axial attention mechanism enhances the correlation between 3D information in the training process and captures long-distance dependencies. It can improve the noise reduction effect and avoid the loss of image structure details. Experimental results show that the network can effectively improve the image quality of a 0.1-s exposure scan to a level close to a 3-s exposure, significantly shortening the sample scanning time.
Keywords: deep learning; tabletop Nano-CT; image denoising; 3D axial attention mechanism
8. Unsupervised multi-modal image translation based on the squeeze-and-excitation mechanism and feature attention module
Authors: HU Zhentao, HU Chonghao, YANG Haoran, SHUAI Weiwei. High Technology Letters, EI CAS, 2024(1): 23-30.
Unsupervised multi-modal image translation is an emerging domain of computer vision whose goal is to transform an image from the source domain into many diverse styles in the target domain. However, a multi-generator mechanism is employed among the advanced approaches available to model different domain mappings, which results in inefficient training of neural networks and pattern collapse, leading to inefficient generation of image diversity. To address this issue, this paper introduces a multi-modal unsupervised image translation framework that uses a single generator to perform multi-modal image translation. Specifically, firstly, a domain code is introduced in this paper to explicitly control the different generation tasks. Secondly, this paper brings in the squeeze-and-excitation (SE) mechanism and a feature attention (FA) module. Finally, the model integrates multiple optimization objectives to ensure efficient multi-modal translation. This paper performs qualitative and quantitative experiments on multiple non-paired benchmark image translation datasets while demonstrating the benefits of the proposed method over existing technologies. Overall, experimental results have shown that the proposed method is versatile and scalable.
Keywords: multi-modal image translation; generative adversarial network (GAN); squeeze-and-excitation (SE) mechanism; feature attention (FA) module
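The squeeze-and-excitation mechanism mentioned in the abstract is a standard channel-reweighting block; a minimal sketch is given below. The reduction ratio and layer sizes are common defaults, not necessarily the ones the paper uses.

```python
# Minimal squeeze-and-excitation (SE) block: squeeze spatially, excite per channel.
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)            # "squeeze": global spatial average
        self.fc = nn.Sequential(                       # "excitation": per-channel gates
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        weights = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * weights                             # re-weight channels

feat = torch.randn(2, 128, 32, 32)
print(SEBlock(128)(feat).shape)                        # torch.Size([2, 128, 32, 32])
```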
9. Pervasive Attentive Neural Network for Intelligent Image Classification Based on N-CDE's
Author: Anas W. Abulfaraj. Computers, Materials & Continua, SCIE EI, 2024(4): 1137-1156.
The utilization of visual attention enhances the performance of image classification tasks. Previous attention-based models have demonstrated notable performance, but many of these models exhibit reduced accuracy when confronted with inter-class and intra-class similarities and differences. Neural Controlled Differential Equations (N-CDE's) and Neural Ordinary Differential Equations (NODE's) are extensively utilized within this context. N-CDE's possess the capacity to effectively illustrate both inter-class and intra-class similarities and differences with enhanced clarity. To this end, an attentive neural network has been proposed to generate attention maps, which uses two different types of N-CDE's, one for adopting hidden layers and the other to generate attention values. Two distinct attention techniques are implemented: time-wise attention, also referred to as bottom N-CDE's, and element-wise attention, called top N-CDE's. Additionally, a training methodology is proposed to guarantee that the training problem is sufficiently presented. Two classification tasks, fine-grained visual classification and multi-label classification, are utilized to evaluate the proposed model. The proposed methodology is employed on five publicly available datasets, including CUB-200-2011, ImageNet-1K, PASCAL VOC 2007, PASCAL VOC 2012, and MS COCO. The obtained visualizations demonstrate that N-CDE's are better suited for attention-based activities in comparison to conventional NODE's.
Keywords: differential equations; neural-controlled DE; image classification; attention maps; N-CDE's
10. Multi-scale attention encoder for street-to-aerial image geo-localization (Cited: 2)
Authors: Songlian Li, Zhigang Tu, Yujin Chen, Tan Yu. CAAI Transactions on Intelligence Technology, SCIE EI, 2023(1): 166-176.
The goal of street-to-aerial cross-view image geo-localization is to determine the location of the query street-view image by retrieving the aerial-view image from the same place. The drastic viewpoint and appearance gap between the aerial-view and the street-view images brings a huge challenge to this task. In this paper, we propose a novel multiscale attention encoder to capture the multiscale contextual information of the aerial/street-view images. To bridge the domain gap between these two view images, we first use an inverse polar transform to make the street-view images approximately aligned with the aerial-view images. Then, the explored multiscale attention encoder is applied to convert the image into a feature representation with the guidance of the learnt multiscale information. Finally, we propose a novel global mining strategy to enable the network to pay more attention to hard negative exemplars. Experiments on standard benchmark datasets show that our approach obtains an 81.39% top-1 recall rate on the CVUSA dataset and 71.52% on the CVACT dataset, achieving state-of-the-art performance and outperforming most of the existing methods significantly.
Keywords: global mining strategy; image geo-localization; multiscale attention encoder; street-to-aerial cross-view
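The polar-transform step referenced above warps one view so that the two views share a roughly common layout before feature extraction. The sketch below resamples an aerial image into a panorama-like strip purely to illustrate the geometry; the transform direction, angle origin, and radius scaling used in the paper are not specified here and are assumptions.

```python
# Polar resampling of a square aerial image into a panorama-like strip.
import numpy as np

def polar_transform(aerial: np.ndarray, out_h: int = 128, out_w: int = 512) -> np.ndarray:
    s = aerial.shape[0]                       # assume a square S x S aerial image
    center = (s - 1) / 2.0
    i = np.arange(out_h)[:, None]             # output row  -> radius
    j = np.arange(out_w)[None, :]             # output col  -> azimuth angle
    radius = center * (out_h - i) / out_h
    theta = 2.0 * np.pi * j / out_w
    x = center + radius * np.sin(theta)       # aerial column to sample
    y = center - radius * np.cos(theta)       # aerial row to sample
    xi = np.clip(np.round(x).astype(int), 0, s - 1)
    yi = np.clip(np.round(y).astype(int), 0, s - 1)
    return aerial[yi, xi]                     # nearest-neighbour sampling for brevity

aerial = np.random.rand(256, 256, 3)
panorama_like = polar_transform(aerial)
print(panorama_like.shape)                    # (128, 512, 3)
```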
11. VLCA: vision-language aligning model with cross-modal attention for bilingual remote sensing image captioning (Cited: 1)
Authors: WEI Tingting, YUAN Weilin, LUO Junren, ZHANG Wanpeng, LU Lina. Journal of Systems Engineering and Electronics, SCIE EI CSCD, 2023(1): 9-18.
In the field of satellite imagery, remote sensing image captioning (RSIC) is a hot topic with the challenges of overfitting and difficulty of image and text alignment. To address these issues, this paper proposes a vision-language aligning paradigm for RSIC to jointly represent vision and language. First, a new RSIC dataset, DIOR-Captions, is built by augmenting the object detection in optical remote sensing (DIOR) images dataset with manually annotated Chinese and English content. Second, a Vision-Language aligning model with Cross-modal Attention (VLCA) is presented to generate accurate and abundant bilingual descriptions for remote sensing images. Third, a cross-modal learning network is introduced to address the problem of visual-lingual alignment. Notably, VLCA is also applied to end-to-end Chinese caption generation by using a pre-trained Chinese language model. The experiments are carried out with various baselines to validate VLCA on the proposed dataset. The results demonstrate that the proposed algorithm is more descriptive and informative than existing algorithms in producing captions.
Keywords: remote sensing image captioning (RSIC); vision-language representation; remote sensing image caption dataset; attention mechanism
12. Single Image Deraining Using Dual Branch Network Based on Attention Mechanism for IoT (Cited: 1)
Authors: Di Wang, Bingcai Wei, Liye Zhang. Computer Modeling in Engineering & Sciences, SCIE EI, 2023(11): 1989-2000.
Extracting useful details from images is essential for Internet of Things projects. However, in real life, various external environments, such as bad weather conditions, will cause the occlusion of key target information and image distortion, resulting in difficulties and obstacles to the extraction of key information, affecting the judgment of the real situation in the process of the Internet of Things, and causing system decision-making errors and accidents. In this paper, we mainly solve the problem of rain occluding the image: we remove the rain streaks in the image and obtain a clear, rain-free image. Therefore, the single image deraining algorithm is studied, and a dual-branch network structure based on an attention module and a convolutional neural network (CNN) module is proposed to accomplish the task of rain removal. In order to complete the rain removal of a single image with high quality, we apply a spatial attention module, a channel attention module, and a CNN module to the network structure, and build the network using an encoder-decoder structure. In the experiments, with the structural similarity (SSIM) and the peak signal-to-noise ratio (PSNR) as evaluation indexes, the training and testing results on the rain removal dataset show that the proposed structure performs well on the single image deraining task.
Keywords: Internet of Things; image deraining; dual-branch network structure; attention module; convolutional neural network
13. Deep Attention Network for Pneumonia Detection Using Chest X-Ray Images
Authors: Sukhendra Singh, Sur Singh Rawat, Manoj Gupta, B. K. Tripathi, Faisal Alanzi, Arnab Majumdar, Pattaraporn Khuwuthyakorn, Orawit Thinnukool. Computers, Materials & Continua, SCIE EI, 2023(1): 1673-1691.
In computer vision, object recognition and image categorization have proven to be difficult challenges. They have, nevertheless, generated responses to a wide range of difficult issues from a variety of fields. Convolutional Neural Networks (CNNs) have recently been identified as the most widely proposed deep learning (DL) algorithms in the literature. CNNs have unquestionably delivered cutting-edge achievements, particularly in the areas of image classification, speech recognition, and video processing. However, it has been noticed that the CNN-training assignment demands a large amount of data, which is in low supply, especially in the medical industry, and as a result, the training process takes longer. In this paper, we describe an attention-aware CNN architecture for classifying chest X-ray images to diagnose pneumonia in order to address the aforementioned difficulties. Attention modules provide attention-aware properties to the attention network. The attention-aware features of the various modules change as the layers become deeper. Using a bottom-up top-down feedforward structure, the feedforward and feedback attention processes are integrated into a single feedforward process inside each attention module. In the present work, a deep neural network (DNN) is combined with an attention mechanism to test the prediction of pneumonia disease using chest X-ray pictures. To produce attention-aware features, the suggested network was built by merging channel and spatial attention modules in a DNN architecture. With this network, we worked on a publicly available Kaggle chest X-ray dataset. Extensive testing was carried out to validate the suggested model. In the experimental results, we attained an accuracy of 95.47% and an F-score of 0.92, indicating that the suggested model outperformed the baseline models.
Keywords: attention network; image classification; object detection; residual networks; deep neural network
14. Short-term and long-term memory self-attention network for segmentation of tumours in 3D medical images
Authors: Mingwei Wen, Quan Zhou, Bo Tao, Pavel Shcherbakov, Yang Xu, Xuming Zhang. CAAI Transactions on Intelligence Technology, SCIE EI, 2023(4): 1524-1537.
Tumour segmentation in medical images (especially 3D tumour segmentation) is highly challenging due to the possible similarity between tumours and adjacent tissues, the occurrence of multiple tumours, and variable tumour shapes and sizes. Popular deep learning-based segmentation algorithms generally rely on the convolutional neural network (CNN) and the Transformer. The former cannot extract global image features effectively while the latter lacks the inductive bias and involves complicated computation for 3D volume data. Existing hybrid CNN-Transformer networks can only provide limited performance improvement or even poorer segmentation performance than the pure CNN. To address these issues, a short-term and long-term memory self-attention network is proposed. Firstly, a distinctive self-attention block uses the Transformer to explore the correlation among the region features at different levels extracted by the CNN. Then, the memory structure filters and combines the above information to exclude similar regions and detect the multiple tumours. Finally, multi-layer reconstruction blocks predict the tumour boundaries. Experimental results demonstrate that our method outperforms other methods in terms of subjective visual and quantitative evaluation. Compared with the most competitive method, the proposed method provides Dice (82.4% vs. 76.6%) and 95% Hausdorff distance (HD95) (10.66 vs. 11.54 mm) on KiTS19, as well as Dice (80.2% vs. 78.4%) and HD95 (9.632 vs. 12.17 mm) on LiTS.
Keywords: 3D medical images; convolutional neural network; self-attention network; transformer; tumor segmentation
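The two metrics quoted in this abstract, Dice and HD95, can be computed directly on binary masks. The sketch below is a naive illustration; it ignores voxel spacing and uses one common HD95 convention (percentile of the combined directed distances), which may differ from the implementation the paper used.

```python
# Dice coefficient and 95th-percentile Hausdorff distance on 3D binary masks.
import numpy as np
from scipy.spatial.distance import cdist

def dice(pred: np.ndarray, gt: np.ndarray) -> float:
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum() + 1e-8)

def hd95(pred: np.ndarray, gt: np.ndarray) -> float:
    p = np.argwhere(pred)                    # coordinates of foreground voxels
    g = np.argwhere(gt)
    d = cdist(p, g)                          # pairwise Euclidean distances
    forward = d.min(axis=1)                  # each predicted voxel -> nearest GT voxel
    backward = d.min(axis=0)                 # each GT voxel -> nearest predicted voxel
    return float(np.percentile(np.concatenate([forward, backward]), 95))

pred = np.zeros((32, 32, 32), dtype=bool); pred[8:20, 8:20, 8:20] = True
gt = np.zeros((32, 32, 32), dtype=bool);   gt[10:22, 10:22, 10:22] = True
print(f"Dice = {dice(pred, gt):.3f}, HD95 = {hd95(pred, gt):.2f} voxels")
```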
15. GUS-YOLO Remote Sensing Object Detection Algorithm Incorporating Contextual Information and Attention Gate (Cited: 8)
Authors: Zhang Huawei, Zhang Wenfei, Jiang Zhanjun, Lian Jing, Wu Baijing. Journal of Frontiers of Computer Science and Technology, CSCD, PKU Core, 2024(2): 453-464.
Remote sensing object detection algorithms based on the general-purpose YOLO family currently fail to make full use of the global contextual information of the image, and their feature fusion pyramids do not adequately narrow the semantic gap between fused features or suppress interference from redundant information. Building on the strengths of the YOLO algorithms, the GUS-YOLO algorithm is proposed. It has a backbone network, Global Backbone, that can fully exploit global contextual information. In addition, the algorithm introduces an Attention Gate module into the top-down structure of the feature fusion pyramid, which highlights necessary feature information and suppresses redundant information. Furthermore, an optimal network structure is designed for the Attention Gate module, and the feature fusion structure U-Neck is proposed for the network. Finally, to overcome the problem that the ReLU function may cause the model gradient to stop updating, the algorithm upgrades the activation function of the Attention Gate module to the learnable SMU activation function, improving model robustness. On the NWPU VHR-10 remote sensing dataset, the algorithm achieves performance gains of 1.64 percentage points on the loose metric mAP@0.50 and 9.39 percentage points on the strict metric mAP@0.75 compared with YOLOv7. Compared with seven current mainstream detection algorithms, the algorithm achieves better detection performance.
Keywords: remote sensing images; Global Backbone; Attention Gate; SMU; U-Neck
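The Attention Gate inserted into the top-down fusion path is typically an additive gate of the Attention U-Net style. The sketch below shows that generic form; the SMU activation named in the abstract is replaced by ReLU for brevity, and the channel sizes are illustrative assumptions.

```python
# Additive attention gate: a coarse gating signal re-weights skip-connection features.
import torch
import torch.nn as nn

class AttentionGate(nn.Module):
    def __init__(self, gate_ch: int, skip_ch: int, inter_ch: int):
        super().__init__()
        self.w_g = nn.Conv2d(gate_ch, inter_ch, kernel_size=1)   # gating signal (coarse level)
        self.w_x = nn.Conv2d(skip_ch, inter_ch, kernel_size=1)   # skip-connection features
        self.psi = nn.Sequential(nn.ReLU(inplace=True),
                                 nn.Conv2d(inter_ch, 1, kernel_size=1),
                                 nn.Sigmoid())

    def forward(self, g: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
        # g and x are assumed to have the same spatial size (upsample g beforehand if not).
        alpha = self.psi(self.w_g(g) + self.w_x(x))   # (B, 1, H, W) attention coefficients
        return x * alpha                              # suppress redundant skip features

g = torch.randn(1, 256, 40, 40)   # deeper, semantically richer features
x = torch.randn(1, 128, 40, 40)   # skip features to be re-weighted
print(AttentionGate(256, 128, 64)(g, x).shape)   # torch.Size([1, 128, 40, 40])
```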
16. Simplified Inception Module Based Hadamard Attention Mechanism for Medical Image Classification
Authors: Yanlin Jin, Zhiming You, Ningyin Cai. Journal of Computer and Communications, 2023(6): 1-18.
Medical image classification has played an important role in the medical field, and related methods based on deep learning have become an important and powerful technique in medical image classification. In this article, we propose a simplified inception module based Hadamard attention (SI + HA) mechanism for medical image classification. Specifically, we propose a new attention mechanism: the Hadamard attention mechanism. It improves the accuracy of medical image classification without greatly increasing the complexity of the model. Meanwhile, we adopt a simplified inception module to improve the utilization of parameters. We use two medical image datasets to prove the superiority of our proposed method. On the BreakHis dataset, the AUCs of our method reach 98.74%, 98.38%, 98.61% and 97.67% under the magnification factors of 40×, 100×, 200× and 400×, respectively. The accuracies reach 95.67%, 94.17%, 94.53% and 94.12% under the magnification factors of 40×, 100×, 200× and 400×, respectively. On the KIMIA Path 960 dataset, the AUC and accuracy of our method reach 99.91% and 99.03%. It is superior to the currently popular methods and can significantly improve the effectiveness of medical image classification.
Keywords: deep learning; medical image classification; attention mechanism; inception module
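The abstract does not spell out its Hadamard attention mechanism, so the block below is only one plausible reading: two learned projections are combined by an element-wise (Hadamard) product to form the attention weights. All layer names and the gating form are assumptions, not the authors' design.

```python
# Speculative sketch of an element-wise (Hadamard-product) attention block.
import torch
import torch.nn as nn

class HadamardAttention(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.proj_a = nn.Conv2d(channels, channels, kernel_size=1)
        self.proj_b = nn.Conv2d(channels, channels, kernel_size=1)
        self.gate = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Hadamard (element-wise) product of the two projections, squashed to [0, 1].
        attn = self.gate(self.proj_a(x) * self.proj_b(x))
        return x * attn + x                  # residual connection keeps the original signal

feat = torch.randn(4, 64, 56, 56)
print(HadamardAttention(64)(feat).shape)     # torch.Size([4, 64, 56, 56])
```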
17. A Bronchus Segmentation Algorithm for Chest CT Images Based on U-Net + Attention
Authors: Zhang Ziming, Zhou Qinghua, Xue Hongsheng, Qin Wenjun. Chinese Journal of Biomedical Engineering, CAS CSCD, PKU Core, 2024(1): 60-69.
In current airway segmentation, the complex gray-level distribution of CT images and the similarity between target pixels and their surroundings easily lead to over-segmentation; moreover, airway pixels are few, making it difficult to obtain sufficient target features, so small airways are easily missed. To address these difficulties, this study proposes an airway segmentation algorithm that combines the U-Net network with an attention mechanism. The attention mechanism used is the Convolutional Block Attention Module (CBAM), which attends to both the channel and spatial domains and increases the weight of airway features. For the loss function, the focal loss is introduced to deal with the imbalance between positive and negative samples in the raw data; it improves on the standard cross-entropy loss so that hard-to-classify samples receive more attention during training. Finally, isolated points are removed by eight-connected-component analysis, and only the several largest connected components, i.e., the final airway, are retained. A dataset of 24 CT series and 43 CTA series provided by a partner hospital, totaling 26,157 slice images, was used for the segmentation experiments. The results show that the segmentation accuracy reaches 0.86, with mean over-segmentation and under-segmentation rates of 0.28 and 0.39. In the ablation experiments on the attention module and the loss function, the accuracy, over-segmentation rate, and under-segmentation rate before the improvements were 0.81, 0.30, and 0.40, respectively, all inferior to the U-Net + Attention method. Compared with other commonly used methods under the same conditions, the proposed algorithm achieves the highest accuracy while keeping the over-segmentation and under-segmentation rates unchanged, effectively alleviating the inaccurate segmentation of small airways.
Keywords: medical image segmentation; airway; U-Net; attention mechanism; focal loss
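The focal loss referred to in this abstract adds a modulating factor to the cross-entropy loss so that hard, rare foreground pixels contribute more to the gradient. A minimal binary version is sketched below; the gamma and alpha values are common defaults, not the paper's settings.

```python
# Binary focal loss: (1 - p_t)^gamma down-weights easy examples.
import torch
import torch.nn.functional as F

def focal_loss(logits: torch.Tensor, targets: torch.Tensor,
               alpha: float = 0.25, gamma: float = 2.0) -> torch.Tensor:
    # logits, targets: same shape, targets in {0, 1}
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction='none')
    p = torch.sigmoid(logits)
    p_t = p * targets + (1 - p) * (1 - targets)          # probability of the true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    return (alpha_t * (1 - p_t) ** gamma * bce).mean()

logits = torch.randn(2, 1, 64, 64)                       # raw network output for airway mask
targets = (torch.rand(2, 1, 64, 64) > 0.95).float()      # sparse foreground, like airways
print(focal_loss(logits, targets).item())
```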
18. A Wood Surface Crack Detection Method Using Attention U-Net with a Convolutional Block Attention Module
Authors: Xiang Xiaoyang, Wang Mingtao, Duo Huaqiong. Journal of Forestry Engineering, CSCD, PKU Core, 2024(4): 140-146.
Wood defects affect the use value and service life of wood, and surface cracks are a type of defect that severely affects the appearance quality and mechanical strength of wood. Detecting surface cracks makes it possible to identify defective wood as early as possible or to provide a basis for subsequent processing. To address the low efficiency, high cost, and high missed-detection rate of existing manual and automated wood surface crack detection, an Attention U-Net deep learning model incorporating the Convolutional Block Attention Module (CBAM) is used to perform semantic segmentation of wood surface crack images and thereby detect surface cracks. The introduced CBAM module contains a channel attention mechanism and a spatial attention mechanism, which capture inter-channel dependencies and pixel-level spatial relationships, respectively. The module is added to the encoding stage of the Attention U-Net network to increase the weight of regions of interest and suppress redundant information. Ablation experiments verify the segmentation gains from adding CBAM to Attention U-Net. Semantic segmentation metrics including pixel accuracy (PA), class pixel accuracy (CPA), recall, Dice coefficient, intersection over union (IoU), and mean intersection over union (MIoU) are used to evaluate the models and determine the best model and its parameters. In crack segmentation on a self-built wood surface dataset, the Attention U-Net with CBAM trained with the AdamW optimizer improves PA, crack recall, crack Dice coefficient, crack IoU, and MIoU by 0.11%, 4.14%, 2.96%, 3.58%, and 1.84%, respectively, over the original Attention U-Net trained with the SGD optimizer. The results show that the Attention U-Net with CBAM and the AdamW optimizer can segment the background and wood surface cracks well, distinguish knots, surface texture, and cracks, and assign knots and surface texture to the background.
Keywords: image processing; semantic segmentation; wood surface crack detection; deep learning; U-Net model; attention mechanism
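The CBAM module added to the encoder applies channel attention followed by spatial attention. The sketch below follows the common CBAM defaults (reduction ratio 16, 7×7 spatial kernel), which the paper may not use; it is an illustration, not the authors' code.

```python
# CBAM: channel attention (avg+max pooled MLP) then spatial attention (avg+max channel maps).
import torch
import torch.nn as nn

class CBAM(nn.Module):
    def __init__(self, channels: int, reduction: int = 16, spatial_kernel: int = 7):
        super().__init__()
        self.mlp = nn.Sequential(                       # shared MLP for channel attention
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
        )
        self.spatial = nn.Conv2d(2, 1, spatial_kernel, padding=spatial_kernel // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Channel attention: aggregate over space with average and max pooling.
        avg = self.mlp(x.mean(dim=(2, 3), keepdim=True))
        mx = self.mlp(x.amax(dim=(2, 3), keepdim=True))
        x = x * torch.sigmoid(avg + mx)
        # Spatial attention: aggregate over channels with average and max.
        s = torch.cat([x.mean(dim=1, keepdim=True), x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))

feat = torch.randn(2, 64, 96, 96)
print(CBAM(64)(feat).shape)   # torch.Size([2, 64, 96, 96])
```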
19. CAMU-Net: An Improved Retinal Vessel Segmentation Model Based on Attention U-Net
Authors: Tang Yunfei, Dan Zhiping, Hong Zhengtian, Chen Yonglin, Cheng Peilin, Cheng Guo, Liu Fangting. Chinese Journal of Medical Physics, CSCD, 2024(8): 960-968.
An improved U-Net model (CAMU-Net) is proposed for accurate retinal vessel segmentation. The CAMU-Net model adds a residual enhancement module to extract the important information in regional features and improve the model's understanding of them; it adds a feature refinement module to facilitate feature extraction and improve the new model's ability to collect global features; it adds a channel attention module to capture image features and refine the segmentation results; and it introduces a multi-scale feature fusion structure to improve the model's perception of details such as target boundaries. Ablation experiments on the DRIVE dataset show the actual effect of each module and verify its contribution to the various aspects of retinal vessel segmentation by this model; comparative analysis against other mainstream network models on the DRIVE and STARE datasets shows that CAMU-Net outperforms the other models.
Keywords: retinal vessels; image segmentation; deep learning; CAMU-Net; attention mechanism
20. Attention Guided Food Recognition via Multi-Stage Local Feature Fusion
Authors: Gonghui Deng, Dunzhi Wu, Weizhen Chen. Computers, Materials & Continua, SCIE EI, 2024(8): 1985-2003.
The task of food image recognition, a nuanced subset of fine-grained image recognition, grapples with substantial intra-class variation and minimal inter-class differences. These challenges are compounded by the irregular and multi-scale nature of food images. Addressing these complexities, our study introduces an advanced model that leverages multiple attention mechanisms and multi-stage local fusion, grounded in the ConvNeXt architecture. Our model employs hybrid attention (HA) mechanisms to pinpoint critical discriminative regions within images, substantially mitigating the influence of background noise. Furthermore, it introduces a multi-stage local fusion (MSLF) module, fostering long-distance dependencies between feature maps at varying stages. This approach facilitates the assimilation of complementary features across scales, significantly bolstering the model's capacity for feature extraction. Furthermore, we constructed a dataset named Roushi60, which consists of 60 different categories of common meat dishes. Empirical evaluation on the ETH Food-101, ChineseFoodNet, and Roushi60 datasets reveals that our model achieves recognition accuracies of 91.12%, 82.86%, and 92.50%, respectively. These figures not only mark improvements of 1.04%, 3.42%, and 1.36% over the foundational ConvNeXt network but also surpass the performance of most contemporary food image recognition methods. Such advancements underscore the efficacy of our proposed model in navigating the intricate landscape of food image recognition, setting a new benchmark for the field.
Keywords: fine-grained image recognition; food image recognition; attention mechanism; local feature fusion