Abstract
Deep learning models are now widely used in real-world tasks, and improving their robustness has become an important research direction in machine learning. Recent work shows that training a deep model on samples with added noise perturbations can effectively improve its robustness. However, this training procedure usually requires a large number of accurately labeled examples, which are often expensive and difficult to obtain in practice. Active learning (AL) is a primary approach to reducing labeling cost: it progressively selects the most valuable samples and queries their labels, aiming to train an effective model with as few queries as possible. This paper proposes an active-sampling-based framework for robust neural network learning that can significantly improve model robustness at low labeling cost. Within this framework, an inconsistency-based sampling strategy generates a series of perturbed versions of each unlabeled example and uses the differences among their predictions to measure that example's potential utility for improving model robustness. The examples with the largest inconsistency are then selected for training the deep model with noise perturbations. Experiments on benchmark image classification data sets show that the inconsistency-based active sampling strategy effectively improves the robustness of deep neural network models at a lower labeling cost.
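The sampling step described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the perturbation type or the inconsistency measure, so additive Gaussian noise and a mean KL divergence between clean and perturbed predictions are assumed here. The names `inconsistency_scores`, `select_queries`, and `toy_predict_proba` are hypothetical.

```python
import numpy as np


def inconsistency_scores(predict_proba, X, n_perturb=8, sigma=0.1, seed=0):
    """Score each sample by how much its prediction changes under noise.

    predict_proba: callable mapping an (n, d) array to (n, c) class probabilities.
    Assumption: Gaussian noise and mean KL divergence stand in for the
    perturbations and the inconsistency measure, which the abstract leaves open.
    """
    rng = np.random.default_rng(seed)
    eps = 1e-12  # keeps the logarithms finite
    p_clean = np.clip(predict_proba(X), eps, 1.0)
    scores = np.zeros(X.shape[0])
    for _ in range(n_perturb):
        X_noisy = X + rng.normal(0.0, sigma, size=X.shape)
        p_noisy = np.clip(predict_proba(X_noisy), eps, 1.0)
        # KL(p_clean || p_noisy), computed per sample
        scores += np.sum(p_clean * (np.log(p_clean) - np.log(p_noisy)), axis=1)
    return scores / n_perturb


def select_queries(scores, batch_size):
    """Return the indices of the batch_size samples with the largest scores."""
    order = np.argsort(-scores, kind="stable")
    return order[:batch_size]


def toy_predict_proba(X):
    """Hypothetical stand-in for a trained network: a fixed linear softmax model."""
    W = np.array([[4.0, -4.0], [0.0, 0.0]])
    logits = X @ W
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(logits)
    return e / e.sum(axis=1, keepdims=True)


if __name__ == "__main__":
    # One point on the decision boundary and two points far from it.
    X_unlabeled = np.array([[0.0, 0.0], [5.0, 0.0], [-5.0, 0.0]])
    scores = inconsistency_scores(toy_predict_proba, X_unlabeled)
    print("query indices:", select_queries(scores, batch_size=1))
```

In a full active learning loop, the selected examples would be sent to an annotator, added to the labeled set, and used for the next round of training with noise perturbations; that loop is omitted here.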
Authors
ZHOU Hui (周慧)
SHI Hao-chen (施皓晨)
TU Yao-feng (屠要峰)
HUANG Sheng-jun (黄圣君)
College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China; MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, Nanjing 211106, China; State Key Laboratory of Mobile Network and Mobile Multimedia Technology, Shenzhen, Guangdong 518057, China
Source
Computer Science (《计算机科学》)
CSCD
Peking University Core Journals (北大核心)
2022, Issue 7, pp. 164-169 (6 pages)
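Funding
Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2020AAA0107000)
National Natural Science Foundation of China (62076128)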
Keywords
Deep learning
Noise perturbations
Active learning
Model robustness
Inconsistency