Abstract
Objective Foreground segmentation is an important task in image understanding. Under unsupervised conditions, different images and instances often have highly variable appearances, which makes it difficult for methods based on fixed rules or a single type of feature to deliver stable segmentation performance. To address this problem, this paper proposes an unsupervised foreground segmentation method based on semantic-apparent feature fusion (SAFF).

Method Semantic features respond accurately to the key regions of foreground objects, but the resulting foreground segmentation usually covers only those key regions and lacks a complete description of the object. Apparent features, represented by saliency and edges, provide richer detail, but appearance-based rules cannot cope with the variety of instances and imaging conditions. To combine the advantages of the two types of features, we build an encoding of unary region features and binary context features that fuses semantic and apparent information, giving a comprehensive description of both kinds of cues. We then design an intra-image adaptive parameter learning method that computes the most suitable feature weights and generates a foreground confidence score map. Furthermore, a segmentation network is used to learn foreground features shared across instances.

Result By fusing semantic and apparent features and learning common semantics across images, the proposed method clearly outperforms class activation mapping (CAM) and discriminative regional feature integration (DRFI) on the PASCAL VOC (pattern analysis, statistical modelling and computational learning visual object classes) 2012 training and validation sets, improving the F-measure by 3.5% and 3.4%, respectively.

Conclusion The proposed method can take any semantic-feature or apparent-feature foreground computation module as a basic unit, fuse and optimize the two strategies, and achieve better foreground segmentation performance.
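The abstract describes the foreground confidence of each superpixel as an adaptively weighted combination of unary and binary, semantic and apparent cues. As a rough illustration only (the symbols below are introduced here and are not taken from the paper), the scoring model can be written as:

\begin{aligned}
g_i^{\mathrm{sem}} &= \sum_{j \neq i} A_{ij}^{\mathrm{app}}\, f_j^{\mathrm{sem}}, \qquad
g_i^{\mathrm{app}} = \sum_{j \neq i} A_{ij}^{\mathrm{sem}}\, f_j^{\mathrm{app}}, \\
s_i &= w_1 f_i^{\mathrm{sem}} + w_2 f_i^{\mathrm{app}} + w_3 g_i^{\mathrm{sem}} + w_4 g_i^{\mathrm{app}} + b,
\end{aligned}

where f_i^sem and f_i^app are the unary semantic and apparent features of superpixel i, A^app and A^sem are pairwise similarity matrices computed from the apparent and semantic features, g_i^sem and g_i^app are the resulting binary (context) features, and the weights w_1, ..., w_4 and the bias b are learned per image by least squares. The exact formulation used in the paper may differ.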
Objective Foreground segmentation is an essential task in image understanding and a preprocessing step for salient object detection, semantic segmentation, and various pixel-level learning tasks. Given an image, the task aims to assign each pixel a foreground or background label. For fully supervised methods, satisfactory results can be achieved via multi-instance learning. However, under unsupervised conditions, achieving stable segmentation performance with fixed rules or a single type of feature is difficult because different images and instances have highly variable appearances. Moreover, different types of methods have different strengths and weaknesses. On the one hand, semantic-feature-based learning can extract the key regions of foregrounds accurately but cannot produce complete object regions or detailed edges. On the other hand, apparent-feature-based frameworks provide richer detail but cannot cope with the wide variety of instances and imaging conditions.

Method Based on these observations, we propose an unsupervised foreground segmentation method based on semantic-apparent feature fusion. First, given a sample, we encode it as semantic and apparent feature maps. We use a class activation mapping model pretrained on ImageNet to generate the semantic heat map and select saliency and edge maps to express the apparent feature. Any kind of semantic or apparent feature can be used, so the framework is widely applicable. Second, to combine the advantages of the two types of features, we split the image into superpixels and describe each superpixel with four elements: unary and binary semantic and apparent features, which gives a comprehensive description of the two types of expression. Specifically, we build two binary relation matrices that measure the similarity between each pair of superpixels, one based on the apparent feature and one on the semantic feature. To generate the binary semantic feature of a superpixel, the apparent-feature-based similarity is used as the weight; conversely, the semantic-feature-based similarity is used to compute the binary apparent feature. Through this cross-view encoding, the two types of information are fused for the first time. Then, we propose an adaptive parameter learning method to calculate the most suitable feature weights and generate the foreground confidence score map. Based on the four elements, each superpixel's foreground confidence score is expressed as a linear combination of the elements. For an image, we first select superpixels whose unary semantic and apparent features give highly confident foreground or background scores. We then learn the weights of the four elements and the bias of the linear combination by least squares estimation. With these adaptive parameters, a better confidence score can be inferred for each superpixel individually. Third, we use a segmentation network to learn common foreground features from different instances. In weakly supervised semantic segmentation, fully supervised frameworks are used to improve the pseudo annotations of training data and to provide inference results. Inspired by this idea, we use a convolutional network to mine common foreground features across instances. The trained model can then be used to improve the quality of foreground segmentation both for the images used in training and for new data directly. Better performance is achieved by fusing semantic and apparent features and by cascading the intra-image adaptive feature weight learning and inter-image common feature learning modules.
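To make the intra-image adaptive weight learning step concrete, the following Python sketch illustrates the described procedure under stated assumptions; the function name, seed thresholds, feature normalization, and clipping are illustrative and do not come from the authors' implementation.

import numpy as np

def adaptive_confidence(feats, hi=0.8, lo=0.2):
    """feats: (N, 4) array of per-superpixel features, one column each for the
    unary semantic, unary apparent, binary semantic, and binary apparent cues,
    assumed normalized to [0, 1]. Returns a foreground confidence per superpixel."""
    unary_sem, unary_app = feats[:, 0], feats[:, 1]

    # Select superpixels whose unary cues agree confidently on foreground/background.
    fg_seed = (unary_sem > hi) & (unary_app > hi)
    bg_seed = (unary_sem < lo) & (unary_app < lo)
    seeds = fg_seed | bg_seed
    targets = fg_seed[seeds].astype(float)  # 1 for foreground seeds, 0 for background seeds

    # Least-squares fit of the four weights and the bias on the seed superpixels.
    X = np.hstack([feats[seeds], np.ones((seeds.sum(), 1))])
    w, *_ = np.linalg.lstsq(X, targets, rcond=None)

    # Infer a confidence score for every superpixel with the learned weights.
    X_all = np.hstack([feats, np.ones((feats.shape[0], 1))])
    return np.clip(X_all @ w, 0.0, 1.0)

Fitting the weights separately for each image lets the method emphasize whichever cue is more reliable for that particular instance, which is the core idea behind the intra-image adaptive parameter learning described above.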
Result We test our method on the pattern analysis, statistical modelling and computational learning visual object classes (PASCAL VOC) 2012 training and validation sets, which contain 10582 and 1449 samples, respectively. Precision-recall curves and the F-measure are used as evaluation indicators. Compared with typical semantic- and apparent-feature-based foreground segmentation methods, the proposed framework achieves clear improvements over the baselines: the F-measure improves by 3.5% on the PASCAL VOC 2012 training set and by 3.4% on the validation set. We also analyze visualized results to illustrate the advantages of the fusion framework. The comparison shows that adaptive feature fusion produces accurate and detailed results, while remaining incorrect cases are further corrected by the multi-instance learning framework.

Conclusion In this study, we propose a semantic-apparent feature fusion method for unsupervised foreground segmentation. Given an image, we first compute the unary semantic and apparent features of each superpixel. We then integrate the two types of features by cross-using the apparent and semantic similarity measures, establishing a context relationship between each pair of superpixels to compute the binary features of each region. Next, we design an adaptive weight learning strategy that automatically adjusts the influence of each feature dimension on the foreground estimation in each specific image, yielding the weighting parameters for optimal foreground segmentation and the foreground confidence of the image. Finally, we build a foreground segmentation network to learn the common foreground features among different instances and samples; with the trained network, the images can be re-inferred to obtain more accurate foreground segmentation results. Experiments on the PASCAL VOC 2012 training and validation sets demonstrate the effectiveness and generalization ability of the algorithm. Moreover, the proposed method can take other foreground segmentation methods as baselines and can be widely applied to improve tasks such as foreground segmentation and weakly supervised semantic segmentation. We also believe that introducing more types of semantic and apparent features into the fusion, and alternately iterating between mining the spatial context within an image and the common features across instances, is a feasible way to further improve foreground segmentation and an important idea for semantic segmentation tasks.
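For reference, the F-measure used for evaluation is the weighted harmonic mean of precision and recall. The sketch below assumes the common beta-squared = 0.3 setting used in foreground and saliency evaluation, since the exact value is not stated in this abstract.

def f_measure(precision: float, recall: float, beta2: float = 0.3) -> float:
    """Weighted harmonic mean of precision and recall; beta2 = beta squared.
    The default 0.3 is an assumption here, not taken from the paper."""
    if precision + recall == 0.0:
        return 0.0
    return (1.0 + beta2) * precision * recall / (beta2 * precision + recall)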
Authors
Li Xi
Ma Huimin
Ma Hongbing
Wang Yidong
Li Xi; Ma Huimin; Ma Hongbing; Wang Yidong (Tsinghua University, Beijing 100084, China; University of Science and Technology Beijing, Beijing 100083, China; Xinjiang University, Urumqi 830046, China)
Source
《中国图象图形学报》
CSCD
Peking University Core Journals
2021, No. 10, pp. 2503-2513 (11 pages)
Journal of Image and Graphics
Funding
National Natural Science Foundation of China (U20B2062, 61773231)
National Key Research and Development Program of China (2016YFB0100901)
Beijing Municipal Science and Technology Project (Z191100007419001).