Journal Articles
2,813 articles found
1. Axial Assembled Correspondence Network for Few-Shot Semantic Segmentation (Cited by: 2)
Authors: Yu Liu, Bin Jiang, Jiaming Xu. IEEE/CAA Journal of Automatica Sinica, SCIE EI CSCD, 2023, Issue 3, pp. 711-721 (11 pages)
Few-shot semantic segmentation aims at training a model that can segment novel classes in a query image with only a few densely annotated support exemplars. It remains a challenge because of large intra-class variations between the support and query images. Existing approaches utilize 4D convolutions to mine semantic correspondence between the support and query images. However, they still suffer from heavy computation, sparse correspondence, and large memory. We propose axial assembled correspondence network (AACNet) to alleviate these issues. The key point of AACNet is the proposed axial assembled 4D kernel, which constructs the basic block for the semantic correspondence encoder (SCE). Furthermore, we propose the deblurring equations to provide more robust correspondence for the aforementioned SCE and design a novel fusion module to mix correspondences in a learnable manner. Experiments on PASCAL-5^i reveal that our AACNet achieves a mean intersection-over-union score of 65.9% for 1-shot segmentation and 70.6% for 5-shot segmentation, surpassing the state-of-the-art method by 5.8% and 5.0%, respectively.
Keywords: Artificial intelligence, computer vision, deep convolutional neural network, few-shot semantic segmentation
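The core ideas in this abstract, a 4D correlation volume between support and query features processed by axially factorized kernels instead of dense 4D convolutions, can be illustrated as follows. This is a hedged PyTorch sketch of the general axial-factorization idea, not the authors' AACNet implementation; the tensor layout, channel count, and kernel size are assumptions.

```python
# Illustrative sketch only (not the authors' AACNet code): build a 4D correlation
# volume between query and support features, then process it with 1D convolutions
# applied along each of its four spatial axes in turn, instead of a dense 4D kernel.
import torch
import torch.nn as nn
import torch.nn.functional as F

def correlation_volume(query_feat, support_feat):
    # query_feat: (B, C, Hq, Wq), support_feat: (B, C, Hs, Ws)
    q = F.normalize(query_feat, dim=1)
    s = F.normalize(support_feat, dim=1)
    corr = torch.einsum('bchw,bcxy->bhwxy', q, s)      # (B, Hq, Wq, Hs, Ws)
    return corr.unsqueeze(1)                           # add a channel dimension

class AxialCorrelationBlock(nn.Module):
    """Apply a 1D convolution along each axis of the 4D correlation volume."""
    def __init__(self, channels=1, kernel_size=3):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv1d(channels, channels, kernel_size, padding=kernel_size // 2)
            for _ in range(4)
        )

    @staticmethod
    def _conv_along_axis(x, axis, conv):
        x = x.movedim(axis, -1)                        # (B, C, a, b, c, L)
        B, C, a, b, c, L = x.shape
        x = x.permute(0, 2, 3, 4, 1, 5).reshape(-1, C, L)
        x = conv(x)
        x = x.reshape(B, a, b, c, C, L).permute(0, 4, 1, 2, 3, 5)
        return x.movedim(-1, axis)

    def forward(self, corr):                           # corr: (B, C, Hq, Wq, Hs, Ws)
        for axis, conv in zip(range(2, 6), self.convs):
            corr = F.relu(self._conv_along_axis(corr, axis, conv))
        return corr
```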
2. Part-Whole Relational Few-Shot 3D Point Cloud Semantic Segmentation
Authors: Shoukun Xu, Lujun Zhang, Guangqi Jiang, Yining Hua, Yi Liu. Computers, Materials & Continua, SCIE EI, 2024, Issue 3, pp. 3021-3039 (19 pages)
This paper focuses on the task of few-shot 3D point cloud semantic segmentation. Despite some progress, this task still encounters many issues due to the insufficient samples given, e.g., incomplete object segmentation and inaccurate semantic discrimination. To tackle these issues, we first introduce part-whole relationships into the task of 3D point cloud semantic segmentation to capture semantic integrity, which is empowered by the dynamic capsule routing with the module of 3D Capsule Networks (CapsNets) in the embedding network. Concretely, the dynamic routing amalgamates geometric information of the 3D point cloud data to construct higher-level feature representations, which capture the relationships between object parts and their wholes. Secondly, we design a multi-prototype enhancement module to enhance the prototype discriminability. Specifically, the single-prototype enhancement mechanism is expanded to a multi-prototype enhancement version for capturing rich semantics. Besides, the shot-correlation within the category is calculated via the interaction of different samples to enhance the intra-category similarity. Ablation studies prove that the involved part-whole relations and the proposed multi-prototype enhancement module help to achieve complete object segmentation and improve semantic discrimination. Moreover, under the integration of these two modules, quantitative and qualitative experiments on two public benchmarks, including S3DIS and ScanNet, indicate the superior performance of the proposed framework on the task of 3D point cloud semantic segmentation, compared to some state-of-the-art methods.
Keywords: Few-shot point cloud semantic segmentation, CapsNets
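As a point of reference for the prototype-based components described above, the sketch below shows the standard masked-average-pooling way of extracting a single class prototype from support-point features; the paper's multi-prototype enhancement extends such a prototype, and this sketch is an illustrative assumption, not the authors' code.

```python
# Illustrative sketch: a single class prototype from masked average pooling over
# support point features. The multi-prototype scheme described above would build
# several such prototypes; this is not the authors' implementation.
import torch

def masked_average_prototype(features, mask):
    # features: (B, C, N) per-point features; mask: (B, N) binary foreground mask.
    mask = mask.unsqueeze(1).float()                  # (B, 1, N)
    summed = (features * mask).sum(dim=2)             # (B, C)
    count = mask.sum(dim=2).clamp(min=1e-6)           # avoid division by zero
    return summed / count                             # one prototype per support sample
```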
3. Semantic segmentation-based semantic communication system for image transmission
Authors: Jiale Wu, Celimuge Wu, Yangfei Lin, Tsutomu Yoshinaga, Lei Zhong, Xianfu Chen, Yusheng Ji. Digital Communications and Networks, SCIE CSCD, 2024, Issue 3, pp. 519-527 (9 pages)
With the rapid development of artificial intelligence and the widespread use of the Internet of Things, semantic communication, as an emerging communication paradigm, has been attracting great interest. Taking image transmission as an example, from the semantic communication's view, not all pixels in the images are equally important for certain receivers. The existing semantic communication systems directly perform semantic encoding and decoding on the whole image, in which the region of interest cannot be identified. In this paper, we propose a novel semantic communication system for image transmission that can distinguish between Regions Of Interest (ROI) and Regions Of Non-Interest (RONI) based on semantic segmentation, where a semantic segmentation algorithm is used to classify each pixel of the image and distinguish ROI and RONI. The system also enables high-quality transmission of ROI with lower communication overheads by transmissions through different semantic communication networks with different bandwidth requirements. An improved metric θPSNR is proposed to evaluate the transmission accuracy of the novel semantic transmission network. Experimental results show that our proposed system achieves a significant performance improvement compared with existing approaches, namely, existing semantic communication approaches and the conventional approach without semantics.
Keywords: Semantic communication, semantic segmentation, image transmission, image compression, deep learning
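The ROI/RONI split and the region-aware quality measurement described above can be illustrated as follows. The paper's exact definition of θPSNR is not reproduced here; this is only a hedged sketch of a PSNR restricted to pixels that a segmentation map assigns to classes of interest.

```python
# Hedged sketch (not the paper's exact θPSNR): PSNR evaluated only on pixels whose
# predicted semantic class belongs to the receiver's classes of interest (the ROI).
import numpy as np

def roi_psnr(original, reconstructed, seg_map, roi_classes, max_val=255.0):
    """original/reconstructed: (H, W, 3) images; seg_map: (H, W) predicted class ids."""
    mask = np.isin(seg_map, list(roi_classes))        # boolean ROI mask
    if not mask.any():
        return float("inf")
    diff = original[mask].astype(np.float64) - reconstructed[mask].astype(np.float64)
    mse = np.mean(diff ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(max_val ** 2 / mse)
```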
4. CrossFormer Embedding DeepLabv3+ for Remote Sensing Images Semantic Segmentation
Authors: Qixiang Tong, Zhipeng Zhu, Min Zhang, Kerui Cao, Haihua Xing. Computers, Materials & Continua, SCIE EI, 2024, Issue 4, pp. 1353-1375 (23 pages)
High-resolution remote sensing image segmentation is a challenging task. In urban remote sensing, the presence of occlusions and shadows often results in blurred or invisible object boundaries, thereby increasing the difficulty of segmentation. In this paper, an improved network with a cross-region self-attention mechanism for multi-scale features based on DeepLabv3+ is designed to address the difficulties of small object segmentation and blurred target edge segmentation. First, we use CrossFormer as the backbone feature extraction network to achieve the interaction between large- and small-scale features, and establish self-attention associations between features at both large and small scales to capture global contextual feature information. Next, an improved atrous spatial pyramid pooling module is introduced to establish multi-scale feature maps with large- and small-scale feature associations, and attention vectors are added in the channel direction to enable adaptive adjustment of multi-scale channel features. The proposed network model is validated using the Potsdam and Vaihingen datasets. The experimental results show that, compared with existing techniques, the network model designed in this paper can extract and fuse multiscale information, more clearly extract edge information and small-scale information, and segment boundaries more smoothly. Experimental results on public datasets demonstrate the superiority of our method compared with several state-of-the-art networks.
Keywords: Semantic segmentation, remote sensing, multiscale, self-attention
5. An Improved UNet Lightweight Network for Semantic Segmentation of Weed Images in Corn Fields
Authors: Yu Zuo, Wenwen Li. Computers, Materials & Continua, SCIE EI, 2024, Issue 6, pp. 4413-4431 (19 pages)
In cornfields, factors such as the similarity between corn seedlings and weeds and the blurring of plant edge details pose challenges to corn and weed segmentation. In addition, remote areas such as farmland are usually constrained by limited computational resources and limited collected data. Therefore, it becomes necessary to lighten the model to better adapt to complex cornfield scenes and make full use of the limited data information. In this paper, we propose an improved image segmentation algorithm based on UNet. Firstly, the inverted residual structure is introduced into the contraction path to reduce the number of parameters in the training process and improve the feature extraction ability; secondly, the pyramid pooling module is introduced to enhance the network's ability to acquire contextual information as well as its ability to deal with the small-target loss problem; and finally, to further enhance the segmentation capability of the model, the squeeze-and-excitation mechanism is introduced in the expansion path. We used images of corn seedlings collected in the field and publicly available corn weed datasets to evaluate the improved model. The improved model has a total of 3.79 M parameters and achieves an mIoU of 87.9%. The FPS on a single 3050 Ti video card is about 58.9. The experimental results show that the network proposed in this paper can quickly segment corn weeds in a cornfield scenario with good segmentation accuracy.
Keywords: Semantic segmentation, deep learning, UNet, pyramid pooling module
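Of the three additions described above, the squeeze-and-excitation mechanism is the simplest to illustrate. The sketch below is a generic SE block in PyTorch with an assumed reduction ratio, not the paper's exact configuration.

```python
# Generic squeeze-and-excitation block (illustrative; reduction ratio is an assumption).
import torch.nn as nn

class SqueezeExcitation(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):                             # x: (B, C, H, W)
        w = x.mean(dim=(2, 3))                        # squeeze: global average pool
        w = self.fc(w).unsqueeze(-1).unsqueeze(-1)    # excitation: per-channel gates
        return x * w                                  # rescale feature-map channels
```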
6. A semantic segmentation-based underwater acoustic image transmission framework for cooperative SLAM
Authors: Jiaxu Li, Guangyao Han, Shuai Chang, Xiaomei Fu. Defence Technology (防务技术), SCIE EI CAS CSCD, 2024, Issue 3, pp. 339-351 (13 pages)
With the development of underwater sonar detection technology, the simultaneous localization and mapping (SLAM) approach has attracted much attention in the underwater navigation field in recent years. But the weak detection ability of a single vehicle limits SLAM performance in wide areas. Thereby, cooperative SLAM using multiple vehicles has become an important research direction. The key factor of cooperative SLAM is timely and efficient sonar image transmission among underwater vehicles. However, the limited bandwidth of underwater acoustic channels contradicts the large amount of sonar image data. It is essential to compress the images before transmission. Recently, deep neural networks have shown great value in image compression by virtue of their powerful learning ability, but the existing sonar image compression methods based on neural networks usually focus on pixel-level information without semantic-level information. In this paper, we propose a novel underwater acoustic transmission scheme called UAT-SSIC that includes a semantic segmentation-based sonar image compression (SSIC) framework and a joint source-channel codec, to improve the accuracy of the semantic information of the reconstructed sonar image at the receiver. The SSIC framework consists of an Auto-Encoder structure-based sonar image compression network, which is measured by a semantic segmentation network's residual. Considering that sonar images have the characteristic of blurred target edges, the semantic segmentation network uses a special dilated convolution neural network (DiCNN) to enhance segmentation accuracy by expanding the range of receptive fields. A joint source-channel codec with unequal error protection is proposed that adjusts the power level of the transmitted data, which deals with sonar image transmission errors caused by the harsh underwater acoustic channel. Experiment results demonstrate that our method preserves more semantic information, with advantages over existing methods at the same compression ratio. It also improves the error tolerance and packet loss resistance of transmission.
Keywords: Semantic segmentation, sonar image transmission, learning-based compression
7. ED-Ged: Nighttime Image Semantic Segmentation Based on Enhanced Detail and Bidirectional Guidance
Authors: Xiaoli Yuan, Jianxun Zhang, Xuejie Wang, Zhuhong Chu. Computers, Materials & Continua, SCIE EI, 2024, Issue 8, pp. 2443-2462 (20 pages)
Semantic segmentation of driving scene images is crucial for autonomous driving. While deep learning technology has significantly improved daytime image semantic segmentation, nighttime images pose challenges due to factors like poor lighting and overexposure, making it difficult to recognize small objects. To address this, we propose an Image Adaptive Enhancement (IAEN) module comprising a parameter predictor (Edip), multiple image processing filters (Mdif), and a Detail Processing Module (DPM). Edip combines image processing filters to predict parameters like exposure and hue, optimizing image quality. We adopt a novel image encoder to enhance parameter prediction accuracy by enabling Edip to handle features at different scales. DPM strengthens overlooked image details, extending the IAEN module's functionality. After the segmentation network, we integrate a Depth Guided Filter (DGF) to refine segmentation outputs. The entire network is trained end-to-end, with segmentation results guiding parameter prediction optimization, promoting self-learning and network improvement. This lightweight and efficient network architecture is particularly suitable for addressing challenges in nighttime image segmentation. Extensive experiments validate significant performance improvements of our approach on the ACDC-night and NightCity datasets.
Keywords: Night driving, semantic segmentation, nighttime image processing, adverse illumination, differentiable filters
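As one concrete example of the differentiable filters mentioned above, the sketch below applies a predicted exposure adjustment to an image; the actual filters and their parameterization in ED-Ged may differ, so treat this as an assumption-laden illustration rather than the paper's method.

```python
# Illustrative differentiable exposure filter: the exposure value `ev` would be
# predicted per image by a small parameter-predictor network (not shown here).
import torch

def exposure_filter(img, ev):
    # img: (B, 3, H, W) in [0, 1]; ev: (B, 1, 1, 1) predicted exposure in stops.
    return torch.clamp(img * torch.pow(2.0, ev), 0.0, 1.0)
```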
8. A Random Fusion of Mix3D and PolarMix to Improve Semantic Segmentation Performance in 3D Lidar Point Cloud
Authors: Bo Liu, Li Feng, Yufeng Chen. Computer Modeling in Engineering & Sciences, SCIE EI, 2024, Issue 7, pp. 845-862 (18 pages)
This paper focuses on the effective utilization of data augmentation techniques for 3D lidar point clouds to enhance the performance of neural network models. These point clouds, which represent spatial information through a collection of 3D coordinates, have found wide-ranging applications. Data augmentation has emerged as a potent solution to the challenges posed by limited labeled data and the need to enhance model generalization capabilities. Much of the existing research is devoted to crafting novel data augmentation methods specifically for 3D lidar point clouds. However, there has been a lack of focus on making the most of the numerous existing augmentation techniques. Addressing this deficiency, this research investigates the possibility of combining two fundamental data augmentation strategies. The paper introduces PolarMix and Mix3D, two commonly employed augmentation techniques, and presents a new approach, named RandomFusion. Instead of using a fixed or predetermined combination of augmentation methods, RandomFusion randomly chooses one method from a pool of options for each instance or sample. This innovative data augmentation technique randomly augments each point in the point cloud with either PolarMix or Mix3D. The crux of this strategy is the random choice between PolarMix and Mix3D for the augmentation of each point within the point cloud data set. The results of the experiments conducted validate the efficacy of the RandomFusion strategy in enhancing the performance of neural network models for 3D lidar point cloud semantic segmentation tasks. This is achieved without compromising computational efficiency. By examining the potential of merging different augmentation techniques, the research contributes significantly to a more comprehensive understanding of how to utilize existing augmentation methods for 3D lidar point clouds. The RandomFusion data augmentation technique offers a simple yet effective method to leverage the diversity of augmentation techniques and boost the robustness of models. The insights gained from this research can pave the way for future work aimed at developing more advanced and efficient data augmentation strategies for 3D lidar point cloud analysis.
Keywords: 3D lidar point cloud, data augmentation, RandomFusion, semantic segmentation
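The selection logic described above is simple enough to sketch directly: for each training sample, exactly one of the two existing augmentations is applied, chosen at random. The callables `polar_mix` and `mix3d` below are placeholders for existing implementations, not real library APIs.

```python
# Hedged sketch of per-sample random selection between two existing augmentations.
# `polar_mix` and `mix3d` are placeholder callables, not actual library functions.
import random

def random_fusion(scan_a, labels_a, scan_b, labels_b, polar_mix, mix3d, p=0.5):
    """Apply exactly one of PolarMix or Mix3D to the pair of lidar scans."""
    if random.random() < p:
        return polar_mix(scan_a, labels_a, scan_b, labels_b)
    return mix3d(scan_a, labels_a, scan_b, labels_b)
```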
9. Semantic Segmentation and YOLO Detector over Aerial Vehicle Images
Authors: Asifa Mehmood Qureshi, Abdul Haleem Butt, Abdulwahab Alazeb, Naif Al Mudawi, Mohammad Alonazi, Nouf Abdullah Almujally, Ahmad Jalal, Hui Liu. Computers, Materials & Continua, SCIE EI, 2024, Issue 8, pp. 3315-3332 (18 pages)
Intelligent vehicle tracking and detection are crucial tasks in the realm of highway management. However, vehicles come in a range of sizes, which makes them challenging to detect and affects the traffic monitoring system's overall accuracy. Deep learning is considered to be an efficient method for object detection in vision-based systems. In this paper, we propose a vision-based vehicle detection and tracking system based on a You Only Look Once version 5 (YOLOv5) detector combined with a segmentation technique. The model consists of six steps. In the first step, all the extracted traffic sequence images are subjected to pre-processing to remove noise and enhance the contrast level of the images. These pre-processed images are segmented by labelling each pixel to extract the uniform regions to aid the detection phase. A single-stage detector, YOLOv5, is used to detect and locate vehicles in images. Each detection is exposed to Speeded Up Robust Feature (SURF) extraction to track multiple vehicles. Based on this, a unique number is assigned to each vehicle so it can be located in the succeeding image frames using the feature-matching technique. Further, we implement a Kalman filter to track multiple vehicles. In the end, the vehicle path is estimated by using the centroid points of the rectangular bounding box predicted by the tracking algorithm. The experimental results and comparison reveal that our proposed vehicle detection and tracking system outperformed other state-of-the-art systems. The proposed system provided 94.1% detection precision on the Roundabout dataset and 96.1% detection precision on the Vehicle Aerial Imaging from Drone (VAID) dataset, respectively.
Keywords: Semantic segmentation, YOLOv5, vehicle detection and tracking, Kalman filter, SURF
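The tracking stage described above relies on a Kalman filter over detected box centroids. The sketch below is a generic constant-velocity Kalman filter in NumPy with illustrative noise settings; it is not the paper's implementation.

```python
# Hedged sketch: constant-velocity Kalman filter over a vehicle's box centroid.
# State is (x, y, vx, vy); the measurement is the detected centroid. Noise values
# are illustrative assumptions, not the paper's settings.
import numpy as np

class CentroidKalman:
    def __init__(self, x, y, dt=1.0):
        self.state = np.array([x, y, 0.0, 0.0])              # position + velocity
        self.P = np.eye(4) * 10.0                             # state covariance
        self.F = np.array([[1, 0, dt, 0],
                           [0, 1, 0, dt],
                           [0, 0, 1, 0],
                           [0, 0, 0, 1]], dtype=float)        # constant-velocity model
        self.H = np.array([[1, 0, 0, 0],
                           [0, 1, 0, 0]], dtype=float)        # only position is observed
        self.Q = np.eye(4) * 0.01                             # process noise
        self.R = np.eye(2) * 1.0                              # measurement noise

    def predict(self):
        self.state = self.F @ self.state
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.state[:2]

    def update(self, z):
        y = np.asarray(z, dtype=float) - self.H @ self.state  # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)              # Kalman gain
        self.state = self.state + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.state[:2]
```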
10. SGT-Net: A Transformer-Based Stratified Graph Convolutional Network for 3D Point Cloud Semantic Segmentation
Authors: Suyi Liu, Jianning Chi, Chengdong Wu, Fang Xu, Xiaosheng Yu. Computers, Materials & Continua, SCIE EI, 2024, Issue 6, pp. 4471-4489 (19 pages)
In recent years, semantic segmentation on 3D point cloud data has attracted much attention. Unlike 2D images, where pixels distribute regularly in the image domain, 3D point clouds in non-Euclidean space are irregular and inherently sparse. Therefore, it is very difficult to extract long-range contexts and effectively aggregate local features for semantic segmentation in 3D point cloud space. Most current methods either focus on local feature aggregation or long-range context dependency, but fail to directly establish a global-local feature extractor to complete the point cloud semantic segmentation tasks. In this paper, we propose a Transformer-based stratified graph convolutional network (SGT-Net), which enlarges the effective receptive field and builds direct long-range dependency. Specifically, we first propose a novel dense-sparse sampling strategy that provides dense local vertices and sparse long-distance vertices for the subsequent graph convolutional network (GCN). Secondly, we propose a multi-key self-attention mechanism based on the Transformer to further weight augmentation for crucial neighboring relationships and enlarge the effective receptive field. In addition, to further improve the efficiency of the network, we propose a similarity measurement module to determine whether the neighborhood near the center point is effective. We demonstrate the validity and superiority of our method on the S3DIS and ShapeNet datasets. Through ablation experiments and segmentation visualization, we verify that the SGT model can improve the performance of point cloud semantic segmentation.
Keywords: 3D point cloud semantic segmentation, long-range contexts, global-local feature, graph convolutional network, dense-sparse sampling strategy
11. Automatic Road Tunnel Crack Inspection Based on Crack Area Sensing and Multiscale Semantic Segmentation
Authors: Dingping Chen, Zhiheng Zhu, Jinyang Fu, Jilin He. Computers, Materials & Continua, SCIE EI, 2024, Issue 4, pp. 1679-1703 (25 pages)
The detection of crack defects on the walls of road tunnels is a crucial step in the process of ensuring travel safety and performing routine tunnel maintenance. The automatic and accurate detection of cracks on the surface of road tunnels is the key to improving the maintenance efficiency of road tunnels. Machine vision technology combined with a deep neural network model is an effective means to realize the localization and identification of crack defects on the surface of road tunnels. We propose a complete set of automatic inspection methods for identifying cracks on the walls of road tunnels as a solution to the problem of difficulty in identifying cracks during manual maintenance. First, a set of equipment applied to the real-time acquisition of high-definition images of walls in road tunnels is designed. Images of walls in road tunnels are acquired based on the designed equipment, where images containing crack defects are manually identified and selected. Subsequently, the training and validation sets used to construct the crack inspection model are obtained based on the acquired images, whereas the regions containing cracks and the pixels of the cracks are finely labeled. After that, a crack area sensing module is designed based on the proposed you only look once version 7 model combined with coordinate attention mechanism (CA-YOLO V7) network to locate the crack regions in the road tunnel surface images. Only subimages containing cracks are acquired and sent to the multiscale semantic segmentation module for extraction of the pixels to which the cracks belong based on the DeepLab V3+ network. The precision and recall of the crack region localization on the surface of a road tunnel based on our proposed method are 82.4% and 93.8%, respectively. Moreover, the mean intersection over union (MIoU) and pixel accuracy (PA) values for achieving pixel-level detection accuracy are 76.84% and 78.29%, respectively. The experimental results on the dataset show that our proposed two-stage detection method outperforms other state-of-the-art models in crack region localization and detection. Based on our proposed method, the images captured on the surface of a road tunnel can complete crack detection at a speed of ten frames/second, and the detection accuracy can reach 0.25 mm, which meets the requirements for maintenance of an actual project. The designed CA-YOLO V7 network enables precise localization of the area to which a crack belongs in images acquired under different environmental and lighting conditions in road tunnels. The improved DeepLab V3+ network based on lightweighting is able to extract crack morphology in a given region more quickly while maintaining segmentation accuracy. The established model combines defect localization and segmentation models for the first time, realizing pixel-level defect localization and extraction on the surface of road tunnels in complex environments, and is capable of determining the actual size of cracks based on the physical coordinate system after camera calibration. The trained model has high accuracy and can be extended and applied to embedded computing devices for the assessment and repair of damaged areas in different types of road tunnels.
Keywords: Road tunnel crack inspection, crack area sensing, multiscale semantic segmentation, CA-YOLO V7, DeepLab V3+
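The two-stage structure described above (region localization followed by pixel-level segmentation of only the cracked sub-images) can be summarized in a few lines. The callables `detector` and `segmenter` below are placeholders standing in for the CA-YOLO V7 and DeepLab V3+ models respectively; this is a hedged sketch, not the authors' pipeline code.

```python
# Hedged sketch of the two-stage inspection flow: detect crack regions first, then
# segment pixels only inside those regions. `detector` and `segmenter` are placeholders.
def inspect_tunnel_image(image, detector, segmenter):
    results = []
    for (x1, y1, x2, y2) in detector(image):           # crack-region bounding boxes
        crop = image[y1:y2, x1:x2]                      # assumes an (H, W, C) array
        results.append(((x1, y1, x2, y2), segmenter(crop)))  # region plus pixel mask
    return results
```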
12. Industry-Oriented Detection Method of PCBA Defects Using Semantic Segmentation Models
Authors: Yang Li, Xiao Wang, Zhifan He, Ze Wang, Ke Cheng, Sanchuan Ding, Yijing Fan, Xiaotao Li, Yawen Niu, Shanpeng Xiao, Zhenqi Hao, Bin Gao, Huaqiang Wu. IEEE/CAA Journal of Automatica Sinica, SCIE EI CSCD, 2024, Issue 6, pp. 1438-1446 (9 pages)
Automated optical inspection (AOI) is a significant process in printed circuit board assembly (PCBA) production lines which aims to detect tiny defects in PCBAs. Existing AOI equipment has several deficiencies including low throughput, large computation cost, high latency, and poor flexibility, which limits the efficiency of online PCBA inspection. In this paper, a novel PCBA defect detection method based on a lightweight deep convolution neural network is proposed. In this method, the semantic segmentation model is combined with a rule-based defect recognition algorithm to build up a defect detection framework. To improve the performance of the model, extensive real PCBA images are collected from production lines as datasets. Some optimization methods have been applied in the model according to production demand and enable integration in lightweight computing devices. Experiment results show that the production line using our method realizes a throughput more than three times higher than traditional methods. Our method can be integrated into a lightweight inference system and promote the flexibility of AOI. The proposed method builds up a general paradigm and excellent example for model design and optimization oriented towards industrial requirements.
Keywords: Automated optical inspection (AOI), deep learning, defect detection, printed circuit board assembly (PCBA), semantic segmentation
13. Triple-Branch Asymmetric Network for Real-time Semantic Segmentation of Road Scenes
Authors: Yazhi Zhang, Xuguang Zhang, Hui Yu. Instrumentation, 2024, Issue 2, pp. 72-82 (11 pages)
As the field of autonomous driving evolves, real-time semantic segmentation has become a crucial part of computer vision tasks. However, most existing methods use lightweight convolution to reduce the computational effort, resulting in lower accuracy. To address this problem, we construct TBANet, a network with an encoder-decoder structure for efficient feature extraction. In the encoder part, the TBA module is designed to extract details and the ETBA module is used to learn semantic representations in a high-dimensional space. In the decoder part, we design a combination of multiple upsampling methods to aggregate features with less computational overhead. We validate the efficiency of TBANet on the Cityscapes dataset. It achieves 75.1% mean Intersection over Union (mIoU) with only 2.07 million parameters and can reach 90.3 Frames Per Second (FPS).
Keywords: Encoder-decoder architecture, lightweight convolution, real-time semantic segmentation
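For reference, the mean Intersection over Union figure reported above is typically computed from a per-class confusion matrix, as in the NumPy sketch below (illustrative, not tied to TBANet's evaluation code).

```python
# Illustrative mIoU computation from predicted and ground-truth label maps.
import numpy as np

def mean_iou(pred, gt, num_classes):
    """pred, gt: integer label arrays of identical shape."""
    conf = np.zeros((num_classes, num_classes), dtype=np.int64)
    np.add.at(conf, (gt.ravel(), pred.ravel()), 1)     # rows: ground truth, cols: prediction
    ious = []
    for c in range(num_classes):
        inter = conf[c, c]
        union = conf[c, :].sum() + conf[:, c].sum() - inter
        if union > 0:                                  # skip classes absent from both
            ious.append(inter / union)
    return float(np.mean(ious))
```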
14. Distilling base-and-meta network with contrastive learning for few-shot semantic segmentation
Authors: Xinyue Chen, Yueyi Wang, Yingyue Xu, Miaojing Shi. Autonomous Intelligent Systems, EI, 2023, Issue 1, pp. 1-11 (11 pages)
Current studies in few-shot semantic segmentation mostly utilize meta-learning frameworks to obtain models that can be generalized to new categories. However, these models trained on base classes with sufficient annotated samples are biased towards these base classes, which results in semantic confusion and ambiguity between base classes and new classes. A strategy is to use an additional base learner to recognize the objects of base classes and then refine the prediction results output by the meta learner. In this way, the interaction between these two learners and the way of combining results from the two learners are important. This paper proposes a new model, namely the Distilling Base and Meta (DBAM) network, which uses a self-attention mechanism and contrastive learning to enhance few-shot segmentation performance. First, the self-attention-based ensemble module (SEM) is proposed to produce a more accurate adjustment factor for improving the fusion of the two learners' predictions. Second, the prototype feature optimization module (PFOM) is proposed to provide an interaction between the two learners, which enhances the ability to distinguish the base classes from the target class by introducing a contrastive learning loss. Extensive experiments have demonstrated that our method improves performance on PASCAL-5^i under both 1-shot and 5-shot settings.
Keywords: Semantic segmentation, few-shot learning, meta learning, contrastive learning, self-attention
15. FISS GAN: A Generative Adversarial Network for Foggy Image Semantic Segmentation (Cited by: 14)
Authors: Kunhua Liu, Zihao Ye, Hongyan Guo, Dongpu Cao, Long Chen, Fei-Yue Wang. IEEE/CAA Journal of Automatica Sinica, SCIE EI CSCD, 2021, Issue 8, pp. 1428-1439 (12 pages)
Because pixel values of foggy images are irregularly higher than those of images captured in normal weather (clear images), it is difficult to extract and express their texture. No method has previously been developed to directly explore the relationship between foggy images and semantic segmentation images. We investigated this relationship and propose a generative adversarial network (GAN) for foggy image semantic segmentation (FISS GAN), which contains two parts: an edge GAN and a semantic segmentation GAN. The edge GAN is designed to generate edge information from foggy images to provide auxiliary information to the semantic segmentation GAN. The semantic segmentation GAN is designed to extract and express the texture of foggy images and generate semantic segmentation images. Experiments on foggy cityscapes datasets and foggy driving datasets indicated that FISS GAN achieved state-of-the-art performance.
Keywords: Edge GAN, foggy images, foggy image semantic segmentation, GAN, semantic segmentation
16. End-to-end dilated convolution network for document image semantic segmentation (Cited by: 8)
Authors: XU Can-hui, SHI Cao, CHEN Yi-nong. Journal of Central South University, SCIE EI CAS CSCD, 2021, Issue 6, pp. 1765-1774 (10 pages)
Semantic segmentation is a crucial step for document understanding. In this paper, an NVIDIA Jetson Nano-based platform is applied for implementing semantic segmentation for teaching artificial intelligence concepts and programming. To extract semantic structures from document images, we present an end-to-end dilated convolution network architecture. Dilated convolutions have well-known advantages for extracting multi-scale context information without losing spatial resolution. Our model utilizes dilated convolutions with a residual network to represent the image features and predict pixel labels. The convolution part works as a feature extractor to obtain multidimensional and hierarchical image features. The consecutive deconvolution is used for producing full-resolution segmentation predictions. The probability of each pixel decides its predefined semantic class label. To understand segmentation granularity, we compare performances at three different levels. From fine-grained class to coarse class levels, the proposed dilated convolution network architecture is evaluated on three document datasets. The experimental results have shown that both semantic data distribution imbalance and network depth are important factors that influence the document's semantic segmentation performance. The research is aimed at offering an education resource for teaching artificial intelligence concepts and techniques.
Keywords: Semantic segmentation, document images, deep learning, NVIDIA Jetson Nano
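The receptive-field argument in this abstract is the standard motivation for dilated convolutions: stacking them with increasing dilation widens context without downsampling. The PyTorch snippet below only illustrates that idea; the channel counts and dilation rates are assumptions, not the paper's architecture.

```python
# Illustrative stack of dilated convolutions; padding equal to the dilation rate
# keeps the spatial resolution unchanged for 3x3 kernels.
import torch.nn as nn

dilated_block = nn.Sequential(
    nn.Conv2d(256, 256, kernel_size=3, padding=1, dilation=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(256, 256, kernel_size=3, padding=2, dilation=2),
    nn.ReLU(inplace=True),
    nn.Conv2d(256, 256, kernel_size=3, padding=4, dilation=4),
    nn.ReLU(inplace=True),
)
```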
17. Improved Denoising Autoencoder for Maritime Image Denoising and Semantic Segmentation of USV (Cited by: 3)
Authors: Yuhang Qiu, Yongcheng Yang, Zhijian Lin, Pingping Chen, Yang Luo, Wenqi Huang. China Communications, SCIE CSCD, 2020, Issue 3, pp. 46-57 (12 pages)
The unmanned surface vehicle (USV) is currently a hot research topic in maritime communication networks (MCN), where denoising and semantic segmentation of maritime images taken by USVs have rarely been studied. For the former, autoencoder models have recently been researched for image denoising, but the existing models are too complicated to be suitable for real-time detection on a USV. In this paper, we propose a lightweight autoencoder combined with an inception module for maritime image denoising in different noisy environments and explore the effect of different inception modules on the denoising performance. Furthermore, we complete the semantic segmentation task for maritime images taken by USVs utilizing a pretrained U-Net model with tuning, and compare it with the original U-Net model based on different backbones. Subsequently, we compare the semantic segmentation of noised and denoised maritime images respectively to explore the effect of image noise on semantic segmentation performance. Case studies are provided to prove the feasibility of our proposed denoising and segmentation method. Finally, a simple integrated communication system combining image denoising and segmentation for USVs is shown.
Keywords: USV, denoising, autoencoder, semantic segmentation, U-Net
18. Semantic segmentation method of road scene based on Deeplabv3+ and attention mechanism (Cited by: 6)
Authors: BAI Yanqiong, ZHENG Yufu, TIAN Hong. Journal of Measurement Science and Instrumentation, CAS CSCD, 2021, Issue 4, pp. 412-422 (11 pages)
In the study of automatic driving, understanding the road scene is key to improving driving safety. The semantic segmentation method can divide the image into different areas associated with semantic categories at the pixel level, so as to help vehicles perceive and obtain the surrounding road environment information, which improves driving safety. Deeplabv3+ is the current popular semantic segmentation model, but small targets are missed and similar objects are easily misjudged during its semantic segmentation tasks, which leads to rough segmentation boundaries and reduces semantic accuracy. Focusing on this issue, and based on the Deeplabv3+ network structure combined with the attention mechanism to increase the weight of the segmentation area, this study proposes an improved Deeplabv3+ method that fuses attention mechanisms for road scene semantic segmentation. First, a group of parallel position attention modules and channel attention modules are introduced at the Deeplabv3+ encoding end to capture more spatial context information and high-level semantic information. Then, an attention mechanism is introduced at the decoding end to restore the spatial detail information, and the data are normalized in order to accelerate the convergence speed of the model. The effects of model segmentation with different attention-introducing mechanisms are compared and tested on the CamVid and Cityscapes datasets. The experimental results show that the mean Intersection over Union values of the improved model on the two datasets are boosted by 6.88% and 2.58%, respectively, which is better than using Deeplabv3+. This method does not significantly increase the amount of network calculation and complexity, and has a good balance of speed and accuracy.
Keywords: Autonomous driving, road scene, semantic segmentation, Deeplabv3+, attention mechanism
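A channel attention module of the kind paired with position attention above can be sketched as follows; this is a generic formulation in the spirit of dual-attention networks, not the paper's exact module.

```python
# Illustrative channel attention: channel-to-channel similarity weights re-aggregate
# the feature map, and a learnable scalar fuses the result with the input.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self):
        super().__init__()
        self.gamma = nn.Parameter(torch.zeros(1))     # learnable fusion weight

    def forward(self, x):                             # x: (B, C, H, W)
        B, C, H, W = x.shape
        flat = x.view(B, C, -1)                       # (B, C, HW)
        energy = torch.bmm(flat, flat.transpose(1, 2))    # (B, C, C) channel affinities
        attn = torch.softmax(energy, dim=-1)
        out = torch.bmm(attn, flat).view(B, C, H, W)
        return self.gamma * out + x
```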
19. A Lane Detection Method Based on Semantic Segmentation (Cited by: 3)
Authors: Ling Ding, Huyin Zhang, Jinsheng Xiao, Cheng Shu, Shejie Lu. Computer Modeling in Engineering & Sciences, SCIE EI, 2020, Issue 3, pp. 1039-1053 (15 pages)
This paper proposes a novel method of lane detection, which adopts VGG16 as the basis of the convolutional neural network to extract lane line features by dilated convolution, wherein the lane lines are divided into dotted lines and solid lines. Expanding the receptive field through dilated convolution, the fully connected layers of the network are discarded, the last max-pooling layer of the VGG16 network is removed, and the processing of the last three convolution layers is replaced by dilated convolution. At the same time, the CNN adopts an encoder-decoder structure and uses the indices of the max-pooling layers in the decoder part to upsample the encoder output in an unpooling manner, realizing semantic segmentation. This is combined with instance segmentation, and the lane lines are finally detected through fitting. In addition, the currently disclosed lane line datasets are relatively small, and there is no distinction between solid and dashed lane lines. To this end, our work built a lane line dataset that labels dashed and solid lane lines separately, and the proposed segmentation-based algorithm was effectively verified on this dataset. The final test shows that the proposed method has a good balance between lane detection speed and accuracy, and has good robustness.
Keywords: CNN, VGG16, semantic segmentation, instance segmentation, lane detection
20. Artificial Intelligence-Based Semantic Segmentation of Ocular Regions for Biometrics and Healthcare Applications (Cited by: 4)
Authors: Rizwan Ali Naqvi, Dildar Hussain, Woong-Kee Loh. Computers, Materials & Continua, SCIE EI, 2021, Issue 1, pp. 715-732 (18 pages)
Multiple ocular region segmentation plays an important role in different applications such as biometrics, liveness detection, healthcare, and gaze estimation. Typically, segmentation techniques focus on a single region of the eye at a time. Despite a number of obvious advantages, very limited research has focused on multiple regions of the eye. Similarly, accurate segmentation of multiple eye regions is necessary in challenging scenarios involving blur, ghost effects, low resolution, off-angles, and unusual glints. Currently, the available segmentation methods cannot address these constraints. In this paper, to address the accurate segmentation of multiple eye regions in unconstrained scenarios, a lightweight outer residual encoder-decoder network suitable for various sensor images is proposed. The proposed method can determine the true boundaries of the eye regions from inferior-quality images using the high-frequency information flow of the outer residual encoder-decoder deep convolutional neural network (called ORED-Net). Moreover, the proposed ORED-Net model does not gain its performance from added complexity, number of parameters, or network depth. The proposed network is considerably lighter than previous state-of-the-art models. Comprehensive experiments were performed, and optimal performance was achieved on the SBVPI and UBIRIS.v2 datasets containing images of the eye region. The proposed ORED-Net achieved mean intersection over union (mIoU) scores of 89.25 and 85.12 on the challenging SBVPI and UBIRIS.v2 datasets, respectively.
Keywords: Semantic segmentation, ocular regions, biometrics for healthcare, sensors, deep learning