2,797 articles found
Bilateral U-Net semantic segmentation with spatial attention mechanism (Cited by: 2)
1
Authors: Guangzhe Zhao, Yimeng Zhang, Maoning Ge, Min Yu. CAAI Transactions on Intelligence Technology (SCIE, EI), 2023, Issue 2, pp. 297-307, 11 pages
To address the poor segmentation performance of existing models on imbalanced data sets with small-scale samples, a bilateral U-Net network model with a spatial attention mechanism is designed. The model uses the lightweight MobileNetV2 as the backbone network for hierarchical feature extraction and proposes an Attentive Pyramid Spatial Attention (APSA) module, compared to the Attenuated Spatial Pyramid module, which can increase the receptive field and enhance the information. Finally, it adds a context fusion prediction branch that fuses high-semantic and low-semantic prediction results, and the model effectively improves the segmentation accuracy on small data sets. The experimental results on the CamVid data set show that, compared with some existing semantic segmentation networks, the algorithm has a better segmentation effect and segmentation accuracy, and its mIOU reaches 75.85%. Moreover, to verify the generality of the model and the effectiveness of the APSA module, experiments were conducted on the VOC 2012 data set, and the APSA module improved mIOU by about 12.2%.
Keywords: attention mechanism; receptive field; semantic fusion; semantic segmentation; spatial attention module; U-Net
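The mIOU figures quoted in this abstract are usually obtained by computing the intersection over union for each class and averaging across classes. The following is a minimal sketch in plain Python, assuming predictions and ground truth are flat lists of integer class labels; the function name `mean_iou` and the toy labels are illustrative assumptions and are not taken from the paper.

```python
def mean_iou(pred, target, num_classes):
    """Average the per-class IoU over classes present in pred or target."""
    ious = []
    for c in range(num_classes):
        # Pixels where both prediction and ground truth are class c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either prediction or ground truth is class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes that never occur
            ious.append(inter / union)
    if not ious:
        raise ValueError("no class from range(num_classes) occurs in the inputs")
    return sum(ious) / len(ious)


# Toy example (hypothetical labels): per-class IoU is 1/3, 2/3 and 1/2.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, target, num_classes=3))
```

Skipping classes with an empty union is one common convention; some benchmarks instead average over a fixed class list, so reported numbers may differ slightly between toolkits.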
Semantic segmentation-based semantic communication system for image transmission
2
Authors: Jiale Wu, Celimuge Wu, Yangfei Lin, Tsutomu Yoshinaga, Lei Zhong, Xianfu Chen, Yusheng Ji. Digital Communications and Networks (SCIE, CSCD), 2024, Issue 3, pp. 519-527, 9 pages
With the rapid development of artificial intelligence and the widespread use of the Internet of Things, semantic communication, as an emerging communication paradigm, has been attracting great interest. Taking image transmission as an example, from the viewpoint of semantic communication, not all pixels in an image are equally important for certain receivers. Existing semantic communication systems directly perform semantic encoding and decoding on the whole image, so the region of interest cannot be identified. In this paper, we propose a novel semantic communication system for image transmission that can distinguish between Regions Of Interest (ROI) and Regions Of Non-Interest (RONI) based on semantic segmentation, where a semantic segmentation algorithm is used to classify each pixel of the image and distinguish ROI and RONI. The system also enables high-quality transmission of ROI with lower communication overheads by transmitting through different semantic communication networks with different bandwidth requirements. An improved metric θPSNR is proposed to evaluate the transmission accuracy of the novel semantic transmission network. Experimental results show that our proposed system achieves a significant performance improvement compared with existing approaches, namely, existing semantic communication approaches and the conventional approach without semantics.
Keywords: semantic communication; semantic segmentation; image transmission; image compression; deep learning
CrossFormer Embedding DeepLabv3+ for Remote Sensing Images Semantic Segmentation
3
Authors: Qixiang Tong, Zhipeng Zhu, Min Zhang, Kerui Cao, Haihua Xing. Computers, Materials & Continua (SCIE, EI), 2024, Issue 4, pp. 1353-1375, 23 pages
High-resolution remote sensing image segmentation is a challenging task. In urban remote sensing, the presence of occlusions and shadows often results in blurred or invisible object boundaries, thereby increasing the difficulty of segmentation. In this paper, an improved network with a cross-region self-attention mechanism for multi-scale features based on DeepLabv3+ is designed to address the difficulties of small object segmentation and blurred target edge segmentation. First, we use CrossFormer as the backbone feature extraction network to achieve the interaction between large- and small-scale features, and establish self-attention associations between features at both large and small scales to capture global contextual feature information. Next, an improved atrous spatial pyramid pooling module is introduced to establish multi-scale feature maps with large- and small-scale feature associations, and attention vectors are added in the channel direction to enable adaptive adjustment of multi-scale channel features. The proposed network model is validated using the Potsdam and Vaihingen datasets. The experimental results show that, compared with existing techniques, the network model designed in this paper can extract and fuse multiscale information, more clearly extract edge information and small-scale information, and segment boundaries more smoothly. Experimental results on public datasets demonstrate the superiority of our method compared with several state-of-the-art networks.
Keywords: semantic segmentation; remote sensing; multiscale; self-attention
Part-Whole Relational Few-Shot 3D Point Cloud Semantic Segmentation
4
Authors: Shoukun Xu, Lujun Zhang, Guangqi Jiang, Yining Hua, Yi Liu. Computers, Materials & Continua (SCIE, EI), 2024, Issue 3, pp. 3021-3039, 19 pages
This paper focuses on the task of few-shot 3D point cloud semantic segmentation. Despite some progress, this task still encounters many issues due to the insufficient samples given, e.g., incomplete object segmentation and inaccurate semantic discrimination. To tackle these issues, we first introduce part-whole relationships into the task of 3D point cloud semantic segmentation to capture semantic integrity, which is empowered by dynamic capsule routing with the module of 3D Capsule Networks (CapsNets) in the embedding network. Concretely, the dynamic routing amalgamates geometric information of the 3D point cloud data to construct higher-level feature representations, which capture the relationships between object parts and their wholes. Secondly, we design a multi-prototype enhancement module to enhance the prototype discriminability. Specifically, the single-prototype enhancement mechanism is expanded to a multi-prototype enhancement version for capturing rich semantics. Besides, the shot-correlation within the category is calculated via the interaction of different samples to enhance the intra-category similarity. Ablation studies show that the involved part-whole relations and the proposed multi-prototype enhancement module help to achieve complete object segmentation and improve semantic discrimination. Moreover, under the integration of these two modules, quantitative and qualitative experiments on two public benchmarks, S3DIS and ScanNet, indicate the superior performance of the proposed framework on the task of 3D point cloud semantic segmentation, compared to some state-of-the-art methods.
Keywords: few-shot point cloud semantic segmentation; CapsNets
An Improved UNet Lightweight Network for Semantic Segmentation of Weed Images in Corn Fields
5
Authors: Yu Zuo, Wenwen Li. Computers, Materials & Continua (SCIE, EI), 2024, Issue 6, pp. 4413-4431, 19 pages
In cornfields, factors such as the similarity between corn seedlings and weeds and the blurring of plant edge details pose challenges to corn and weed segmentation. In addition, remote areas such as farmland are usually constrained by limited computational resources and limited collected data. Therefore, it becomes necessary to lighten the model to better adapt to complex cornfield scenes and to make full use of the limited data. In this paper, we propose an improved image segmentation algorithm based on UNet. Firstly, the inverted residual structure is introduced into the contraction path to reduce the number of parameters in the training process and improve the feature extraction ability; secondly, the pyramid pooling module is introduced to enhance the network's ability to acquire contextual information as well as its ability to deal with the small target loss problem; finally, to further enhance the segmentation capability of the model, the squeeze and excitation mechanism is introduced in the expansion path. We used images of corn seedlings collected in the field and publicly available corn weed datasets to evaluate the improved model. The improved model has 3.79 M parameters in total and achieves an mIoU of 87.9%. The FPS on a single 3050 Ti video card is about 58.9. The experimental results show that the network proposed in this paper can quickly segment corn weeds in a cornfield scenario with good segmentation accuracy.
Keywords: semantic segmentation; deep learning; UNet; pyramid pooling module
A semantic segmentation-based underwater acoustic image transmission framework for cooperative SLAM
6
Authors: Jiaxu Li, Guangyao Han, Shuai Chang, Xiaomei Fu. Defence Technology (防务技术) (SCIE, EI, CAS, CSCD), 2024, Issue 3, pp. 339-351, 13 pages
With the development of underwater sonar detection technology, the simultaneous localization and mapping (SLAM) approach has attracted much attention in the underwater navigation field in recent years. However, the weak detection ability of a single vehicle limits SLAM performance in wide areas. Cooperative SLAM using multiple vehicles has therefore become an important research direction. The key factor of cooperative SLAM is timely and efficient sonar image transmission among underwater vehicles. However, the limited bandwidth of underwater acoustic channels conflicts with the large amount of sonar image data, so it is essential to compress the images before transmission. Recently, deep neural networks have shown great value in image compression by virtue of their powerful learning ability, but existing neural-network-based sonar image compression methods usually focus on pixel-level information without semantic-level information. In this paper, we propose a novel underwater acoustic transmission scheme called UAT-SSIC that includes a semantic segmentation-based sonar image compression (SSIC) framework and a joint source-channel codec, to improve the accuracy of the semantic information of the reconstructed sonar image at the receiver. The SSIC framework consists of an Auto-Encoder structure-based sonar image compression network, which is measured by a semantic segmentation network's residual. Considering that sonar images have blurred target edges, the semantic segmentation network uses a special dilated convolution neural network (DiCNN) to enhance segmentation accuracy by expanding the range of receptive fields. A joint source-channel codec with unequal error protection is proposed that adjusts the power level of the transmitted data to deal with sonar image transmission errors caused by the severe underwater acoustic channel. Experimental results demonstrate that our method preserves more semantic information, with advantages over existing methods at the same compression ratio. It also improves the error tolerance and packet loss resistance of transmission.
Keywords: semantic segmentation; sonar image transmission; learning-based compression
ED-Ged: Nighttime Image Semantic Segmentation Based on Enhanced Detail and Bidirectional Guidance
7
Authors: Xiaoli Yuan, Jianxun Zhang, Xuejie Wang, Zhuhong Chu. Computers, Materials & Continua (SCIE, EI), 2024, Issue 8, pp. 2443-2462, 20 pages
Semantic segmentation of driving scene images is crucial for autonomous driving. While deep learning technology has significantly improved daytime image semantic segmentation, nighttime images pose challenges due to factors like poor lighting and overexposure, making it difficult to recognize small objects. To address this, we propose an Image Adaptive Enhancement (IAEN) module comprising a parameter predictor (Edip), multiple image processing filters (Mdif), and a Detail Processing Module (DPM). Edip combines image processing filters to predict parameters like exposure and hue, optimizing image quality. We adopt a novel image encoder to enhance parameter prediction accuracy by enabling Edip to handle features at different scales. DPM strengthens overlooked image details, extending the IAEN module's functionality. After the segmentation network, we integrate a Depth Guided Filter (DGF) to refine segmentation outputs. The entire network is trained end-to-end, with segmentation results guiding parameter prediction optimization, promoting self-learning and network improvement. This lightweight and efficient network architecture is particularly suitable for addressing challenges in nighttime image segmentation. Extensive experiments validate significant performance improvements of our approach on the ACDC-night and Nightcity datasets.
Keywords: night driving; semantic segmentation; nighttime image processing; adverse illumination; differentiable filters
A Random Fusion of Mix3D and PolarMix to Improve Semantic Segmentation Performance in 3D Lidar Point Cloud
8
Authors: Bo Liu, Li Feng, Yufeng Chen. Computer Modeling in Engineering & Sciences (SCIE, EI), 2024, Issue 7, pp. 845-862, 18 pages
This paper focuses on the effective utilization of data augmentation techniques for 3D lidar point clouds to enhance the performance of neural network models. These point clouds, which represent spatial information through a collection of 3D coordinates, have found wide-ranging applications. Data augmentation has emerged as a potent solution to the challenges posed by limited labeled data and the need to enhance model generalization capabilities. Much of the existing research is devoted to crafting novel data augmentation methods specifically for 3D lidar point clouds. However, there has been a lack of focus on making the most of the numerous existing augmentation techniques. Addressing this deficiency, this research investigates the possibility of combining two fundamental data augmentation strategies. The paper introduces PolarMix and Mix3D, two commonly employed augmentation techniques, and presents a new approach, named RandomFusion. Instead of using a fixed or predetermined combination of augmentation methods, RandomFusion randomly chooses one method from a pool of options for each instance or sample. This innovative data augmentation technique randomly augments each point in the point cloud with either PolarMix or Mix3D. The crux of this strategy is the random choice between PolarMix and Mix3D for the augmentation of each point within the point cloud data set. The results of the experiments conducted validate the efficacy of the RandomFusion strategy in enhancing the performance of neural network models for 3D lidar point cloud semantic segmentation tasks. This is achieved without compromising computational efficiency. By examining the potential of merging different augmentation techniques, the research contributes significantly to a more comprehensive understanding of how to utilize existing augmentation methods for 3D lidar point clouds. The RandomFusion data augmentation technique offers a simple yet effective method to leverage the diversity of augmentation techniques and boost the robustness of models. The insights gained from this research can pave the way for future work aimed at developing more advanced and efficient data augmentation strategies for 3D lidar point cloud analysis.
Keywords: 3D lidar point cloud; data augmentation; RandomFusion; semantic segmentation
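The random selection described in this abstract can be illustrated with a short sketch. It follows the per-sample reading of the abstract (one augmentation is drawn at random for each training sample). The two augmentation functions below are deliberately simplified stand-ins written for this example, not the published Mix3D or PolarMix implementations, and all names (`mix3d_like`, `polarmix_like`, `random_fusion`) are hypothetical.

```python
import math
import random


def mix3d_like(cloud_a, cloud_b):
    """Simplified stand-in: merge two labelled scans into one scene."""
    return cloud_a + cloud_b


def polarmix_like(cloud_a, cloud_b, start=0.0, end=math.pi):
    """Simplified stand-in: swap in the points of B that fall inside an
    azimuth sector [start, end) and drop the points of A in that sector."""
    def in_sector(point):
        azimuth = math.atan2(point[1], point[0]) % (2 * math.pi)
        return start <= azimuth < end

    kept = [p for p in cloud_a if not in_sector(p)]
    pasted = [p for p in cloud_b if in_sector(p)]
    return kept + pasted


def random_fusion(cloud_a, cloud_b, rng=random):
    """Pick one augmentation uniformly at random for this sample."""
    augment = rng.choice([mix3d_like, polarmix_like])
    return augment(cloud_a, cloud_b)


# Points are (x, y, z, label) tuples; the values are toy data.
scan_a = [(1, 1, 0, "a"), (1, -1, 0, "a")]
scan_b = [(-1, 1, 0, "b"), (-1, -1, 0, "b")]
augmented = random_fusion(scan_a, scan_b, random.Random(0))
print(len(augmented))
```

Passing a seeded `random.Random` instance keeps the choice reproducible across runs, which is convenient when comparing augmentation strategies.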
Semantic Segmentation and YOLO Detector over Aerial Vehicle Images
9
Authors: Asifa Mehmood Qureshi, Abdul Haleem Butt, Abdulwahab Alazeb, Naif Al Mudawi, Mohammad Alonazi, Nouf Abdullah Almujally, Ahmad Jalal, Hui Liu. Computers, Materials & Continua (SCIE, EI), 2024, Issue 8, pp. 3315-3332, 18 pages
Intelligent vehicle tracking and detection are crucial tasks in the realm of highway management. However, vehicles come in a range of sizes, which makes them challenging to detect and affects the traffic monitoring system's overall accuracy. Deep learning is considered to be an efficient method for object detection in vision-based systems. In this paper, we propose a vision-based vehicle detection and tracking system based on a You Only Look Once version 5 (YOLOv5) detector combined with a segmentation technique. The model consists of six steps. In the first step, all the extracted traffic sequence images are subjected to pre-processing to remove noise and enhance the contrast level of the images. These pre-processed images are segmented by labelling each pixel to extract the uniform regions to aid the detection phase. A single-stage detector, YOLOv5, is used to detect and locate vehicles in images. Each detection is subjected to Speeded Up Robust Feature (SURF) feature extraction to track multiple vehicles. Based on this, a unique number is assigned to each vehicle to easily locate it in the succeeding image frames using the feature-matching technique. Further, we implement a Kalman filter to track multiple vehicles. In the end, the vehicle path is estimated by using the centroid points of the rectangular bounding box predicted by the tracking algorithm. The experimental results and comparison reveal that our proposed vehicle detection and tracking system outperforms other state-of-the-art systems. The proposed system provides 94.1% detection precision on the Roundabout dataset and 96.1% detection precision on the Vehicle Aerial Imaging from Drone (VAID) dataset.
Keywords: semantic segmentation; YOLOv5; vehicle detection and tracking; Kalman filter; SURF
PCB CT Image Element Segmentation Model Optimizing the Semantic Perception of Connectivity Relationship
10
Authors: Chen Chen, Kai Qiao, Jie Yang, Jian Chen, Bin Yan. Computers, Materials & Continua (SCIE, EI), 2024, Issue 11, pp. 2629-2642, 14 pages
Computed Tomography (CT) is a commonly used technology in Printed Circuit Board (PCB) non-destructive testing, and element segmentation of CT images is a key subsequent step. With the development of deep learning, researchers began to exploit the "pre-training and fine-tuning" training process for multi-element segmentation, reducing the time spent on manual annotation. However, the existing element segmentation model only focuses on the overall accuracy at the pixel level, ignoring whether the element connectivity relationship can be correctly identified. To this end, this paper proposes a PCB CT image element segmentation model optimizing the semantic perception of connectivity relationship (OSPC-seg). The overall training process adopts a "pre-training and fine-tuning" training process. A loss function that optimizes the semantic perception of circuit connectivity relationship (OSPC Loss) is designed from the aspect of alleviating the class imbalance problem and improving the correct connectivity rate. Also, the correct connectivity rate index (CCR) is proposed to evaluate the model's connectivity relationship recognition capabilities. Experiments show that the mIoU and CCR of OSPC-seg on our datasets are 90.1% and 97.0%, improved by 1.5% and 1.6% respectively compared with the baseline model. The visualization results show that the segmentation performance at connection positions is significantly improved, which also demonstrates the effectiveness of OSPC-seg.
Keywords: semantic segmentation; PCB; non-destructive testing; mask image modeling; connectivity relationship
SGT-Net: A Transformer-Based Stratified Graph Convolutional Network for 3D Point Cloud Semantic Segmentation
11
Authors: Suyi Liu, Jianning Chi, Chengdong Wu, Fang Xu, Xiaosheng Yu. Computers, Materials & Continua (SCIE, EI), 2024, Issue 6, pp. 4471-4489, 19 pages
In recent years, semantic segmentation on 3D point cloud data has attracted much attention. Unlike 2D images, where pixels are distributed regularly in the image domain, 3D point clouds in non-Euclidean space are irregular and inherently sparse. Therefore, it is very difficult to extract long-range contexts and effectively aggregate local features for semantic segmentation in 3D point cloud space. Most current methods focus on either local feature aggregation or long-range context dependency, but fail to directly establish a global-local feature extractor to complete point cloud semantic segmentation tasks. In this paper, we propose a Transformer-based stratified graph convolutional network (SGT-Net), which enlarges the effective receptive field and builds direct long-range dependency. Specifically, we first propose a novel dense-sparse sampling strategy that provides dense local vertices and sparse long-distance vertices for the subsequent graph convolutional network (GCN). Secondly, we propose a multi-key self-attention mechanism based on the Transformer to further apply weight augmentation to crucial neighboring relationships and enlarge the effective receptive field. In addition, to further improve the efficiency of the network, we propose a similarity measurement module to determine whether the neighborhood near the center point is effective. We demonstrate the validity and superiority of our method on the S3DIS and ShapeNet datasets. Through ablation experiments and segmentation visualization, we verify that the SGT model can improve the performance of point cloud semantic segmentation.
Keywords: 3D point cloud; semantic segmentation; long-range contexts; global-local feature; graph convolutional network; dense-sparse sampling strategy
Automatic Road Tunnel Crack Inspection Based on Crack Area Sensing and Multiscale Semantic Segmentation
12
Authors: Dingping Chen, Zhiheng Zhu, Jinyang Fu, Jilin He. Computers, Materials & Continua (SCIE, EI), 2024, Issue 4, pp. 1679-1703, 25 pages
The detection of crack defects on the walls of road tunnels is a crucial step in ensuring travel safety and performing routine tunnel maintenance. The automatic and accurate detection of cracks on the surface of road tunnels is the key to improving the maintenance efficiency of road tunnels. Machine vision technology combined with a deep neural network model is an effective means to realize the localization and identification of crack defects on the surface of road tunnels. We propose a complete set of automatic inspection methods for identifying cracks on the walls of road tunnels as a solution to the difficulty of identifying cracks during manual maintenance. First, a set of equipment for the real-time acquisition of high-definition images of walls in road tunnels is designed. Images of walls in road tunnels are acquired with the designed equipment, and images containing crack defects are manually identified and selected. Subsequently, the training and validation sets used to construct the crack inspection model are obtained from the acquired images, and the regions containing cracks and the pixels of the cracks are finely labeled. After that, a crack area sensing module is designed based on the proposed you only look once version 7 model combined with coordinate attention mechanism (CA-YOLO V7) network to locate the crack regions in the road tunnel surface images. Only subimages containing cracks are acquired and sent to the multiscale semantic segmentation module, which extracts the pixels belonging to the cracks based on the DeepLab V3+ network. The precision and recall of the crack region localization on the surface of a road tunnel based on our proposed method are 82.4% and 93.8%, respectively. Moreover, the mean intersection over union (MIoU) and pixel accuracy (PA) values for pixel-level detection are 76.84% and 78.29%, respectively. The experimental results on the dataset show that our proposed two-stage detection method outperforms other state-of-the-art models in crack region localization and detection. Based on our proposed method, crack detection on images captured on the surface of a road tunnel runs at a speed of ten frames/second, and the detection accuracy can reach 0.25 mm, which meets the maintenance requirements of an actual project. The designed CA-YOLO V7 network enables precise localization of the area to which a crack belongs in images acquired under different environmental and lighting conditions in road tunnels. The improved lightweight DeepLab V3+ network is able to extract crack morphology in a given region more quickly while maintaining segmentation accuracy. The established model combines defect localization and segmentation models for the first time, realizing pixel-level defect localization and extraction on the surface of road tunnels in complex environments, and is capable of determining the actual size of cracks based on the physical coordinate system after camera calibration. The trained model has high accuracy and can be extended and applied to embedded computing devices for the assessment and repair of damaged areas in different types of road tunnels.
Keywords: road tunnel; crack inspection; crack area sensing; multiscale semantic segmentation; CA-YOLO V7; DeepLab V3+
Industry-Oriented Detection Method of PCBA Defects Using Semantic Segmentation Models
13
Authors: Yang Li, Xiao Wang, Zhifan He, Ze Wang, Ke Cheng, Sanchuan Ding, Yijing Fan, Xiaotao Li, Yawen Niu, Shanpeng Xiao, Zhenqi Hao, Bin Gao, Huaqiang Wu. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2024, Issue 6, pp. 1438-1446, 9 pages
Automated optical inspection (AOI) is a significant process in printed circuit board assembly (PCBA) production lines which aims to detect tiny defects in PCBAs. Existing AOI equipment has several deficiencies, including low throughput, large computation cost, high latency, and poor flexibility, which limit the efficiency of online PCBA inspection. In this paper, a novel PCBA defect detection method based on a lightweight deep convolution neural network is proposed. In this method, the semantic segmentation model is combined with a rule-based defect recognition algorithm to build up a defect detection framework. To improve the performance of the model, extensive real PCBA images are collected from production lines as datasets. Some optimization methods have been applied in the model according to production demand and enable integration in lightweight computing devices. Experimental results show that the production line using our method realizes a throughput more than three times higher than traditional methods. Our method can be integrated into a lightweight inference system and promote the flexibility of AOI. The proposed method builds up a general paradigm and an excellent example for model design and optimization oriented towards industrial requirements.
Keywords: automated optical inspection (AOI); deep learning; defect detection; printed circuit board assembly (PCBA); semantic segmentation
Triple-Branch Asymmetric Network for Real-time Semantic Segmentation of Road Scenes
14
Authors: Yazhi Zhang, Xuguang Zhang, Hui Yu. Instrumentation, 2024, Issue 2, pp. 72-82, 11 pages
As the field of autonomous driving evolves, real-time semantic segmentation has become a crucial part of computer vision tasks. However, most existing methods use lightweight convolution to reduce the computational effort, resulting in lower accuracy. To address this problem, we construct TBANet, a network with an encoder-decoder structure for efficient feature extraction. In the encoder part, the TBA module is designed to extract details and the ETBA module is used to learn semantic representations in a high-dimensional space. In the decoder part, we design a combination of multiple upsampling methods to aggregate features with less computational overhead. We validate the efficiency of TBANet on the Cityscapes dataset. It achieves 75.1% mean Intersection over Union (mIoU) with only 2.07 million parameters and can reach 90.3 Frames Per Second (FPS).
Keywords: encoder-decoder architecture; lightweight convolution; real-time semantic segmentation
Image Semantic Segmentation for Autonomous Driving Based on Improved U-Net
15
作者 Chuanlong Sun Hong Zhao +2 位作者 Liang Mu Fuliang Xu Laiwei Lu 《Computer Modeling in Engineering & Sciences》 SCIE EI 2023年第7期787-801,共15页
Image semantic segmentation has become an essential part of autonomous driving.To further improve the generalization ability and the robustness of semantic segmentation algorithms,a lightweight algorithm network based... Image semantic segmentation has become an essential part of autonomous driving.To further improve the generalization ability and the robustness of semantic segmentation algorithms,a lightweight algorithm network based on Squeeze-and-Excitation Attention Mechanism(SE)and Depthwise Separable Convolution(DSC)is designed.Meanwhile,Adam-GC,an Adam optimization algorithm based on Gradient Compression(GC),is proposed to improve the training speed,segmentation accuracy,generalization ability and stability of the algorithm network.To verify and compare the effectiveness of the algorithm network proposed in this paper,the trained networkmodel is used for experimental verification and comparative test on the Cityscapes semantic segmentation dataset.The validation and comparison results show that the overall segmentation results of the algorithmnetwork can achieve 78.02%MIoU on Cityscapes validation set,which is better than the basic algorithm network and the other latest semantic segmentation algorithms network.Besides meeting the stability and accuracy requirements,it has a particular significance for the development of image semantic segmentation. 展开更多
Keywords: deep learning; semantic segmentation; attention mechanism; depthwise separable convolution; gradient compression
ARGA-Unet: Advanced U-net segmentation model using residual grouped convolution and attention mechanism for brain tumor MRI image segmentation
16
Authors: Siyi XUN, Yan ZHANG, Sixu DUAN, Mingwei WANG, Jiangang CHEN, Tong TONG, Qinquan GAO, Chantong LAM, Menghan HU, Tao TAN. 《虚拟现实与智能硬件(中英文)》 (Virtual Reality & Intelligent Hardware) (EI), 2024, No. 3, pp. 203-216 (14 pages)
Background: Magnetic resonance imaging (MRI) has played an important role in the rapid growth of medical imaging diagnostic technology, especially in the diagnosis and treatment of brain tumors, owing to its non-invasive characteristics and superior soft tissue contrast. However, brain tumors are characterized by high non-uniformity and non-obvious boundaries in MRI images because of their invasive and highly heterogeneous nature. In addition, the labeling of tumor areas is time-consuming and laborious. Methods: To address these issues, this study uses a residual grouped convolution module, a convolutional block attention module, and a bilinear interpolation upsampling method to improve the classical segmentation network U-net. The influence of network normalization, loss function, and network depth on segmentation performance is further considered. Results: In the experiments, the Dice score of the proposed segmentation model reached 97.581%, which is 12.438% higher than that of the traditional U-net, demonstrating effective segmentation of MRI brain tumor images. Conclusions: The improved U-net achieves good segmentation of brain tumor MRI images.
Keywords: brain tumor; MRI; U-net; segmentation; attention mechanism; deep learning
Axial Assembled Correspondence Network for Few-Shot Semantic Segmentation (cited by: 2)
17
Authors: Yu Liu, Bin Jiang, Jiaming Xu. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2023, No. 3, pp. 711-721 (11 pages)
Few-shot semantic segmentation aims at training a model that can segment novel classes in a query image with only a few densely annotated support exemplars. It remains a challenge because of large intra-class variations between the support and query images. Existing approaches utilize 4D convolutions to mine semantic correspondence between the support and query images. However, they still suffer from heavy computation, sparse correspondence, and large memory. We propose axial assembled correspondence network (AACNet) to alleviate these issues. The key point of AACNet is the proposed axial assembled 4D kernel, which constructs the basic block for semantic correspondence encoder (SCE). Furthermore, we propose the deblurring equations to provide more robust correspondence for the aforementioned SCE and design a novel fusion module to mix correspondences in a learnable manner. Experiments on PASCAL-5^i reveal that our AACNet achieves a mean intersection-over-union score of 65.9% for 1-shot segmentation and 70.6% for 5-shot segmentation, surpassing the state-of-the-art method by 5.8% and 5.0%, respectively.
Keywords: artificial intelligence; computer vision; deep convolutional neural network; few-shot semantic segmentation
Multi-task Learning of Semantic Segmentation and Height Estimation for Multi-modal Remote Sensing Images (cited by: 2)
18
Authors: Mengyu WANG, Zhiyuan YAN, Yingchao FENG, Wenhui DIAO, Xian SUN. Journal of Geodesy and Geoinformation Science (CSCD), 2023, No. 4, pp. 27-39 (13 pages)
Deep learning based methods have been successfully applied to semantic segmentation of optical remote sensing images. However, as more and more remote sensing data is available, it is a new challenge to comprehensively utilize multi-modal remote sensing data to break through the performance bottleneck of single-modal interpretation. In addition, semantic segmentation and height estimation in remote sensing data are two tasks with strong correlation, but existing methods usually study individual tasks separately, which leads to high computational resource overhead. To this end, we propose a Multi-Task learning framework for Multi-Modal remote sensing images (MM_MT). Specifically, we design a Cross-Modal Feature Fusion (CMFF) method, which aggregates complementary information of different modalities to improve the accuracy of semantic segmentation and height estimation. Besides, a dual-stream multi-task learning method is introduced for Joint Semantic Segmentation and Height Estimation (JSSHE), extracting common features in a shared network to save time and resources, and then learning task-specific features in two task branches. Experimental results on the public multi-modal remote sensing image dataset Potsdam show that compared to training two tasks independently, multi-task learning saves 20% of training time and achieves competitive performance with mIoU of 83.02% for semantic segmentation and accuracy of 95.26% for height estimation.
Keywords: multi-modal; multi-task; semantic segmentation; height estimation; convolutional neural network
Semantic segmentation of pyramidal neuron skeletons using geometric deep learning (cited by: 1)
19
Authors: Lanlan Li, Jing Qi, Yi Geng, Jingpeng Wu. Journal of Innovative Optical Health Sciences (SCIE, EI, CSCD), 2023, No. 6, pp. 69-76 (8 pages)
Neurons can be abstractly represented as skeletons due to the filament nature of neurites. With the rapid development of imaging and image analysis techniques, an increasing amount of neuron skeleton data is being produced. In some scientific studies, it is necessary to dissect the axons and dendrites, which is typically done manually and is both tedious and time-consuming. To automate this process, we have developed a method that relies solely on neuronal skeletons using Geometric Deep Learning (GDL). We demonstrate the effectiveness of this method using pyramidal neurons in mammalian brains, and the results are promising for its application in neuroscience studies.
Keywords: pyramidal neuron; geometric deep learning; neuron skeleton; semantic segmentation; point cloud
CFSA-Net: Efficient Large-Scale Point Cloud Semantic Segmentation Based on Cross-Fusion Self-Attention (cited by: 1)
20
Authors: Jun Shu, Shuai Wang, Shiqi Yu, Jie Zhang. Computers, Materials & Continua (SCIE, EI), 2023, No. 12, pp. 2677-2697 (21 pages)
Traditional models for semantic segmentation in point clouds primarily focus on smaller scales. However, in real-world applications, point clouds often exhibit larger scales, leading to heavy computational and memory requirements. The key to handling large-scale point clouds lies in leveraging random sampling, which offers higher computational efficiency and lower memory consumption compared to other sampling methods. Nevertheless, the use of random sampling can potentially result in the loss of crucial points during the encoding stage. To address these issues, this paper proposes cross-fusion self-attention network (CFSA-Net), a lightweight and efficient network architecture specifically designed for directly processing large-scale point clouds. At the core of this network is the incorporation of random sampling alongside a local feature extraction module based on cross-fusion self-attention (CFSA). This module effectively integrates long-range contextual dependencies between points by employing hierarchical position encoding (HPC). Furthermore, it enhances the interaction between each point's coordinates and feature information through cross-fusion self-attention pooling, enabling the acquisition of more comprehensive geometric information. Finally, a residual optimization (RO) structure is introduced to extend the receptive field of individual points by stacking hierarchical position encoding and cross-fusion self-attention pooling, thereby reducing the impact of information loss caused by random sampling. Experimental results on the Stanford Large-Scale 3D Indoor Spaces (S3DIS), Semantic3D, and SemanticKITTI datasets demonstrate the superiority of this algorithm over advanced approaches such as RandLA-Net and KPConv. These findings underscore the excellent performance of CFSA-Net in large-scale 3D semantic segmentation.
Keywords: semantic segmentation; large-scale point cloud; random sampling; cross-fusion self-attention