Abstract
Modern neural networks may produce high-confidence predictions for inputs from outside the training distribution, posing a potential threat to machine learning models. Detecting out-of-distribution (OOD) inputs is therefore a central issue in the safe deployment of models in the real world. Energy-based detection methods compute a sample's energy score directly from the feature vectors extracted by the model, so reliance on unimportant features may degrade detection performance. To alleviate this problem, a loss function based on sparsity regularization is proposed for fine-tuning an already pre-trained classification model: it preserves the model's classification ability during learning while increasing the sparsity of in-distribution sample features, which lowers the energy scores of in-distribution samples and widens the score gap between in-distribution and OOD samples, thereby improving detection performance. The method introduces no auxiliary outlier dataset, avoiding the effect of correlation between samples. Experimental results on CIFAR-10 and CIFAR-100 show that the method reduces the average FPR95 over six OOD datasets by 15.02% and 15.41%, respectively.
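The abstract's two ingredients, an energy score computed from the model's outputs and a fine-tuning loss that adds a sparsity penalty on in-distribution features, can be sketched as below. This is a minimal illustration, not the paper's exact formulation: the temperature `T`, the weight `lam`, and the choice of an L1 norm as the sparsity measure are assumptions, and the energy score follows the standard negative-log-sum-exp form used in energy-based OOD detection.

```python
import numpy as np

def energy_score(logits, T=1.0):
    # Standard energy score: E(x) = -T * log(sum_k exp(logit_k / T)).
    # Lower (more negative) scores indicate in-distribution samples.
    return -T * np.log(np.sum(np.exp(logits / T), axis=-1))

def sparsity_regularized_loss(logits, labels, features, lam=0.01):
    # Cross-entropy term: preserves classification ability while fine-tuning.
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.sum(np.exp(shifted), axis=-1, keepdims=True))
    ce = -log_probs[np.arange(len(labels)), labels].mean()
    # L1 penalty (assumed sparsity measure): pushes in-distribution feature
    # activations toward zero, lowering their energy scores relative to OOD data.
    l1 = np.abs(features).sum(axis=-1).mean()
    return ce + lam * l1
```

On this sketch, a confidently classified sample yields a lower (more negative) energy score than one with flat logits, which is the gap the sparsity term is meant to widen.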
Authors
CHEN Qichao, LI Kuan
School of Cyberspace Security, Dongguan University of Technology, Dongguan 523808, China
Source
Journal of Guilin University of Electronic Technology, 2023, No. 1, pp. 41-48 (8 pages)
Funding
National Natural Science Foundation of China (61876038).
Keywords
neural network
out-of-distribution detection
energy score
fine-tuning
sparsity regularization