Active Learning-Based Image Classification Technology: Status and Future (基于主动学习的图像分类技术:现状与未来)

Cited by: 3

Abstract: Image classification is one of the important research directions in computer vision and has a wide range of applications. The success of deep learning-based image classification depends on large amounts of annotated data, yet data annotation is often expensive. Active learning is a machine learning method that aims to reach the expected model performance with as little high-quality annotated data as possible, alleviating the high annotation cost and the difficulty of obtaining annotations at scale in supervised learning tasks. Guided by a sample selection strategy, an active learning image classification algorithm selects from the unlabeled dataset the samples that are informative and therefore contribute more to training the classification model, has them annotated, and uses them to update the annotated training data pool. This process repeats until a given stopping condition is met or the annotation budget is exhausted. This paper provides a comprehensive survey of the active learning image classification algorithms published in recent years. According to the strategies applied in sample data processing and model structure optimization, existing algorithms are classified into three categories: algorithms based on data augmentation, which either use image augmentation to expand the training data or use the differences that appear after image feature interpolation to select high-quality training data; algorithms based on data distribution information, which optimize the sample selection strategy according to the characteristics of the data distribution; and algorithms that optimize model predictions, which include methods for optimizing how deep model prediction information is acquired and used, methods that improve the structure of the predictive model through generative adversarial networks and reinforcement learning, and methods that enhance prediction performance with the Transformer architecture, all with the goal of ensuring reliable model predictions. In addition, this paper experimentally compares important academic work in each category and analyzes the performance and adaptability of each algorithm on datasets of different scales. Finally, it discusses the challenges faced by active learning image classification technology and points out directions for future research.
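The pool-based selection loop described in the abstract can be sketched as follows. This is a minimal illustration and not any of the surveyed algorithms: it assumes entropy-based uncertainty sampling as the selection strategy, a nearest-centroid classifier in place of a deep network, synthetic 2-D points in place of images, and an oracle that simply reveals the true label. All function names and parameters here are hypothetical.

```python
import numpy as np

def fit_centroids(X, y, n_classes):
    """Stand-in 'model': one mean vector per class."""
    return np.stack([X[y == c].mean(axis=0) for c in range(n_classes)])

def predict_proba(X, centroids):
    """Softmax over negative squared distances to the class centroids."""
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    logits = -d
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def entropy(p):
    """Predictive entropy per sample; higher means more uncertain."""
    return -(p * np.log(p + 1e-12)).sum(axis=1)

def active_learning_loop(X, y_oracle, n_classes, seed_per_class=2,
                         batch_size=5, budget=30, rng=None):
    rng = np.random.default_rng(0) if rng is None else rng
    # Assumption: a small seed set with every class present is labeled up front.
    labeled = np.concatenate([
        rng.choice(np.flatnonzero(y_oracle == c), seed_per_class, replace=False)
        for c in range(n_classes)
    ])
    unlabeled = np.setdiff1d(np.arange(len(X)), labeled)
    spent = 0
    # Stop when the annotation budget is exhausted or the pool is empty.
    while spent < budget and len(unlabeled) > 0:
        centroids = fit_centroids(X[labeled], y_oracle[labeled], n_classes)
        scores = entropy(predict_proba(X[unlabeled], centroids))
        k = min(batch_size, budget - spent, len(unlabeled))
        picked = unlabeled[np.argsort(scores)[-k:]]  # k most uncertain samples
        # "Annotation": the oracle labels are read via y_oracle[labeled] next round.
        labeled = np.concatenate([labeled, picked])
        unlabeled = np.setdiff1d(unlabeled, picked)
        spent += k
    centroids = fit_centroids(X[labeled], y_oracle[labeled], n_classes)
    return labeled, centroids

# Synthetic two-class data standing in for image features.
data_rng = np.random.default_rng(0)
X = np.concatenate([data_rng.normal(-2, 1, (100, 2)),
                    data_rng.normal(2, 1, (100, 2))])
y = np.repeat([0, 1], 100)

labeled, centroids = active_learning_loop(X, y, n_classes=2)
accuracy = (predict_proba(X, centroids).argmax(axis=1) == y).mean()
print(f"labeled {len(labeled)} of {len(X)} samples, accuracy {accuracy:.3f}")
```

Replacing `entropy` with another scoring function, for example one based on data distribution or on augmented views, changes the selection strategy without touching the rest of the loop, which is roughly the axis along which the abstract groups the surveyed methods.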

Authors: LIU Ying; PANG Yu-liang; ZHANG Wei-dong; LI Da-xiang; XU Zhi-jie (Center for Image and Information Processing, Xi'an University of Posts and Telecommunications, Xi'an, Shaanxi 710121, China; International Joint-Research Center for Wireless Communication and Information Processing, Xi'an, Shaanxi 710121, China; Huddersfield University, West Yorkshire HD1 3DH, United Kingdom of Great Britain and Northern Ireland)

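Affiliations: Center for Image and Information Processing, Xi'an University of Posts and Telecommunications; International Joint-Research Center for Wireless Communication and Information Processing; Huddersfield University, United Kingdom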

Source: Acta Electronica Sinica (《电子学报》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2023, No. 10, pp. 2960-2984 (25 pages)

Funding: Young Scientists Fund of the National Natural Science Foundation of China (No. 62106195).

Keywords: image classification; active learning; data augmentation; data distribution; model prediction information; model structure optimization

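Classification codes: TP751 [Automation and Computer Technology: Detection Technology and Automation Devices]; TP391 [Automation and Computer Technology: Computer Application Technology]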
