Abstract
In traditional machine learning, model accuracy often depends on the number of labeled samples. In practice, however, only a small fraction of a massive dataset is accurately labeled and most of it is unlabeled; having professionals label every sample one by one costs a great deal of time and money. Active learning retrieves the most useful samples from a large unlabeled dataset, hands them to professionals for labeling, and then trains the model on those samples in order to improve its accuracy. This paper designs a target detection method for remote sensing images. First, a deep learning network model is constructed and pretrained on the labeled data. Metric learning is then used to select the images in the unlabeled dataset that are most valuable to label, and these are labeled. This process is iterated until the accuracy reaches a set threshold. The method is tested with three labeling levels, in which labeled data account for 14.2%, 21.4%, and 28.6% of the total data. The results show that combining active learning with a U-Net network effectively reduces the amount of labeling needed to achieve the expected model performance.
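The loop described in the abstract (pretrain on labeled data, select the most valuable unlabeled samples, have them labeled, repeat until an accuracy threshold is met) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a nearest-centroid classifier stands in for the U-Net model, a simple distance margin stands in for the metric-learning selection criterion (which the abstract does not specify), and `oracle`, `select_queries`, and all other names and parameter values are hypothetical.

```python
import numpy as np

def fit_centroids(X, y):
    """Stand-in model (assumption, not U-Net): one mean feature vector per class."""
    classes = np.unique(y)
    return classes, np.stack([X[y == c].mean(axis=0) for c in classes])

def distances(X, centroids):
    """Euclidean distance from every sample to every class centroid."""
    return np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)

def predict(X, classes, centroids):
    """Assign each sample to the class of its nearest centroid."""
    return classes[distances(X, centroids).argmin(axis=1)]

def select_queries(X_pool, centroids, k):
    """Hypothetical selection rule: the k pool samples whose two nearest
    centroids are almost equally far away (smallest distance margin)."""
    d = np.sort(distances(X_pool, centroids), axis=1)
    margin = d[:, 1] - d[:, 0]
    return np.argsort(margin)[:k]

def active_learning(X_lab, y_lab, X_pool, oracle, X_val, y_val,
                    threshold=0.95, k=5, max_rounds=20):
    """Iterate train -> evaluate -> query -> label until accuracy >= threshold."""
    history = []  # (number of labeled samples, validation accuracy) per round
    for _ in range(max_rounds):
        classes, centroids = fit_centroids(X_lab, y_lab)
        acc = float((predict(X_val, classes, centroids) == y_val).mean())
        history.append((len(y_lab), acc))
        if acc >= threshold or len(X_pool) == 0:
            break
        idx = select_queries(X_pool, centroids, min(k, len(X_pool)))
        # The oracle plays the role of the human annotator.
        X_lab = np.vstack([X_lab, X_pool[idx]])
        y_lab = np.concatenate([y_lab, oracle(X_pool[idx])])
        X_pool = np.delete(X_pool, idx, axis=0)
    return history

# Synthetic two-class data; the class is the sign of the first feature.
rng = np.random.default_rng(0)
oracle = lambda Z: (Z[:, 0] > 0).astype(int)
X_pool = rng.normal(size=(200, 2))
X_val = rng.normal(size=(100, 2))
X_seed = np.array([[-1.0, 0.0], [1.0, 0.0], [-2.0, 1.0], [2.0, -1.0]])

history = active_learning(X_seed, oracle(X_seed), X_pool, oracle,
                          X_val, oracle(X_val), threshold=0.98, k=5)
for n_labeled, acc in history:
    print(n_labeled, round(acc, 3))
```

Each entry of `history` pairs the number of labeled samples with the validation accuracy at that round, which is the kind of labeling-cost versus accuracy trade-off the abstract reports at the 14.2%, 21.4%, and 28.6% labeling levels.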
Authors
屈晓渊
张永恒
QU Xiao-yuan; ZHANG Yong-heng (School of Information Engineering, Yulin University, Yulin 719000, China)
Source
《计算机与现代化》
2021, No. 11, pp. 50-55, 60 (7 pages in total)
Computer and Modernization
Funding
National Natural Science Foundation of China (72061030)
Key Research and Development Program of Shaanxi Province (2019NY-179)
Keywords
active learning
target detection
remote sensing image
land classification