Visual semantic segmentation aims at separating a visual sample into diverse blocks with specific semantic attributes and identifying the category for each block,and it plays a crucial role in environmental perception...Visual semantic segmentation aims at separating a visual sample into diverse blocks with specific semantic attributes and identifying the category for each block,and it plays a crucial role in environmental perception.Conventional learning-based visual semantic segmentation approaches count heavily on largescale training data with dense annotations and consistently fail to estimate accurate semantic labels for unseen categories.This obstruction spurs a craze for studying visual semantic segmentation with the assistance of few/zero-shot learning.The emergence and rapid progress of few/zero-shot visual semantic segmentation make it possible to learn unseen categories from a few labeled or even zero-labeled samples,which advances the extension to practical applications.Therefore,this paper focuses on the recently published few/zero-shot visual semantic segmentation methods varying from 2D to 3D space and explores the commonalities and discrepancies of technical settlements under different segmentation circumstances.Specifically,the preliminaries on few/zeroshot visual semantic segmentation,including the problem definitions,typical datasets,and technical remedies,are briefly reviewed and discussed.Moreover,three typical instantiations are involved to uncover the interactions of few/zero-shot learning with visual semantic segmentation,including image semantic segmentation,video object segmentation,and 3D segmentation.Finally,the future challenges of few/zero-shot visual semantic segmentation are discussed.展开更多
基金supported by National Key Research and Development Program of China(2021YFB1714300)the National Natural Science Foundation of China(62233005)+2 种基金in part by the CNPC Innovation Fund(2021D002-0902)Fundamental Research Funds for the Central Universities and Shanghai AI Labsponsored by Shanghai Gaofeng and Gaoyuan Project for University Academic Program Development。
文摘Visual semantic segmentation aims at separating a visual sample into diverse blocks with specific semantic attributes and identifying the category for each block,and it plays a crucial role in environmental perception.Conventional learning-based visual semantic segmentation approaches count heavily on largescale training data with dense annotations and consistently fail to estimate accurate semantic labels for unseen categories.This obstruction spurs a craze for studying visual semantic segmentation with the assistance of few/zero-shot learning.The emergence and rapid progress of few/zero-shot visual semantic segmentation make it possible to learn unseen categories from a few labeled or even zero-labeled samples,which advances the extension to practical applications.Therefore,this paper focuses on the recently published few/zero-shot visual semantic segmentation methods varying from 2D to 3D space and explores the commonalities and discrepancies of technical settlements under different segmentation circumstances.Specifically,the preliminaries on few/zeroshot visual semantic segmentation,including the problem definitions,typical datasets,and technical remedies,are briefly reviewed and discussed.Moreover,three typical instantiations are involved to uncover the interactions of few/zero-shot learning with visual semantic segmentation,including image semantic segmentation,video object segmentation,and 3D segmentation.Finally,the future challenges of few/zero-shot visual semantic segmentation are discussed.