An Automatic Sample Data Cleaning Method Based on Weight Iteration

Abstract: Digital line graphics and true digital orthophoto maps can be used to automatically generate large volumes of sample data that meet the requirements of deep learning. However, part of this data contains erroneous information, which makes neural network models harder to train and limits gains in ground feature extraction accuracy. To address this, an automatic sample data cleaning method based on weight iteration is proposed. First, a deep neural network model for data cleaning is constructed, together with a network training method based on weight iteration. The method drops the conventional assumption that all samples carry the same weight in the loss function: the prediction accuracy of each sample during training of the data cleaning network is used as that sample's weight and fed back into training, and the weights are updated continually through iterative training. Finally, samples with low weights are removed, achieving automatic data cleaning and refinement of the sample database. Five classic semantic segmentation network models were trained on the sample database before and after cleaning, and their accuracies were compared. The results show that training on the cleaned sample database improves building extraction accuracy by 2.36% on average, road extraction accuracy by 3.48%, and water body extraction accuracy by 1.88%, indicating that the proposed data cleaning method can effectively improve the extraction accuracy of network models.
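The weight-iteration loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the paper trains a deep segmentation network, whereas the sketch below substitutes a one-dimensional logistic regression, uses the predicted probability of each sample's given label as a stand-in for "prediction accuracy", and assumes hypothetical settings (5 rounds, a removal threshold of 0.5, two deliberately flipped labels). All function and variable names are illustrative.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_weighted(xs, ys, ws, epochs=200, lr=0.5):
    """Fit p(y=1|x) = sigmoid(a*x + b) by gradient descent on a
    sample-weighted log loss (each sample's gradient is scaled by its weight)."""
    a, b = 0.0, 0.0
    total = sum(ws) or 1.0
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for x, y, w in zip(xs, ys, ws):
            err = sigmoid(a * x + b) - y
            grad_a += w * err * x
            grad_b += w * err
        a -= lr * grad_a / total
        b -= lr * grad_b / total
    return a, b


def clean_samples(xs, ys, rounds=5, threshold=0.5):
    """Alternate between weighted training and weight updates, then
    return the indices of samples whose final weight reaches the threshold."""
    ws = [1.0] * len(xs)  # first round: all samples weighted equally
    for _ in range(rounds):
        a, b = train_weighted(xs, ys, ws)
        new_ws = []
        for x, y in zip(xs, ys):
            p = sigmoid(a * x + b)
            # Weight = probability the model assigns to the sample's own label
            # (assumed proxy for per-sample prediction accuracy).
            new_ws.append(p if y == 1 else 1.0 - p)
        ws = new_ws
    keep = [i for i, w in enumerate(ws) if w >= threshold]
    return keep, ws


# Toy data: negatives on the left, positives on the right.
xs = [-3.0 + 0.1 * i for i in range(20)] + [1.1 + 0.1 * i for i in range(20)]
ys = [0] * 20 + [1] * 20
for i in (5, 30):  # hypothetical labelling errors injected for the demo
    ys[i] = 1 - ys[i]

keep, weights = clean_samples(xs, ys)
removed = sorted(set(range(len(xs))) - set(keep))
print("removed sample indices:", removed)
```

In this toy setting, mislabelled samples are fitted poorly, receive low weights, and contribute less and less to later training rounds, so the gap between clean and erroneous samples widens before the final removal step. With image tiles, the per-sample weight would instead come from a segmentation accuracy measure on each tile; which measure the paper uses is not stated in the abstract.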

Authors: XIA Wang; XU Shixuan; TONG Siqi (China Railway Siyuan Survey and Design Group Co., Ltd., Wuhan 430063, China)

Affiliation: China Railway Siyuan Survey and Design Group Co., Ltd.

Source: Railway Investigation and Surveying (《铁道勘察》), 2024, No. 4, pp. 85-91 (7 pages)

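Funding: National Key Research and Development Program of China (2021YFB2600400); Key Project of the Science and Technology Research and Development Program of China Railway Construction Corporation Limited (2022-A02).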

Keywords: high-speed railway; data cleaning; deep learning; object extraction; sample database

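Classification: U238 [Transportation Engineering: Road and Railway Engineering]; P231 [Astronomy and Earth Sciences: Photogrammetry and Remote Sensing]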
