Cross-view scene image localization with Triplet Network integrating NetVLAD and Fully Connected Layers

Times cited: 9
Abstract

Geolocating scene images is important for outdoor positioning, target search, military reconnaissance, and related fields. To address cross-view scene image matching and localization between street-view and bird's-eye imagery, this paper proposes Tri-NetVLAD, a localization method built on a triplet network that fuses NetVLAD (Net Vector of Locally Aggregated Descriptors), a trainable vector of locally aggregated descriptors, with a fully connected layer. The triplet network consists of three convolutional neural network (CNN) branches and processes three images at once; it performs image retrieval and matching by increasing the distance between non-matching image pairs and decreasing the distance between matching pairs. The local convolutional features extracted by the CNN are passed through the NetVLAD layer and the fully connected layer to obtain a global descriptor and a feature vector, respectively, and the two are then fused. This fusion strengthens the correlation among local features while preserving the differences between them, which improves the localization accuracy of the model. The paper also improves the DBL loss (Distance-based layer loss) by adding a parameter λ that strengthens the function's ability to discriminate hard samples, which improves the convergence speed and stability of the model as well as its localization accuracy. Experiments on the public Vo and Hays dataset (United States) show that Tri-NetVLAD achieves higher localization accuracy than existing methods such as MCVPlaces, Triplet eDBL-Net, and CVM-Net, with an accuracy above 63% on the test set.

Cross-view scene image matching and positioning have a wide range of applications in target search, combating crime, and positioning. With the development of deep learning, neural networks have come to play an important role in this problem. In cross-view scene image matching and positioning between street-view and bird's-eye images, however, neural network models converge slowly and the correlation between features is weak. This paper proposes a triplet network model (Tri-NetVLAD) that combines NetVLAD with a fully connected layer and uses an improved DBL loss (ADBL loss). The proposed method improves the convergence speed and stability of the network as well as the overall positioning accuracy of the model. The Tri-NetVLAD model extracts the local features of the three input images through a triplet network and feeds them to the fully connected and NetVLAD layers to obtain a feature vector and a global feature descriptor. The global feature descriptor captures the relative distribution of the features; incorporating the feature vector on top of it preserves the differences between features and improves the positioning accuracy of the model. ADBL loss introduces parameters that improve the model's ability to discriminate difficult samples, and with it the positioning accuracy of the model. The proposed Tri-NetVLAD is compared with several existing methods, namely MCVPlaces, Triplet eDBL-Net, and CVM-Net, and with several loss functions, namely contrastive loss, triplet loss, and DBL loss. On the Vo and Hays dataset (United States), the highest positioning accuracy of 63.5% is achieved, which indicates that a triplet network combining the NetVLAD and fully connected layers, trained with the ADBL loss, can effectively improve positioning accuracy. Compared with existing methods, the proposed Tri-NetVLAD has the following advantages.

(1) The triplet network increases the Euclidean distance between unmatched images while reducing the Euclidean distance between matched images.
(2) NetVLAD aggregates the local features extracted by the CNN to obtain global feature descriptors and the distribution relationship between features.
(3) The fully connected layer contributes a feature vector that is added to the global feature descriptor, so that the final feature vector both represents the distribution relationship between features and retains the differences between them.
(4) The improved loss function, ADBL loss, accelerates gradient convergence and improves the overall positioning accuracy.
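The triplet objective described in the abstract (pull matched pairs together, push unmatched pairs apart in Euclidean distance) can be illustrated with a minimal sketch. The abstract does not give the exact form of the DBL or ADBL loss, so the soft-margin form `log(1 + exp(lam * (d_pos - d_neg)))` used here is an assumption, and the function names and the scale parameter `lam` are hypothetical stand-ins rather than the authors' definitions.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def soft_margin_triplet_loss(anchor, positive, negative, lam=1.0):
    """Soft-margin triplet loss: log(1 + exp(lam * (d_pos - d_neg))).

    `lam` is a hypothetical scale factor (an assumption, not the paper's
    λ): larger values penalize violating triplets more strongly.
    """
    d_pos = euclidean(anchor, positive)  # anchor to matching image
    d_neg = euclidean(anchor, negative)  # anchor to non-matching image
    z = lam * (d_pos - d_neg)
    # Numerically stable evaluation of log(1 + exp(z)).
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


# Toy 2-D "descriptors": the matching image is closer to the anchor.
anchor, match, non_match = [0.0, 0.0], [1.0, 0.0], [3.0, 0.0]
easy = soft_margin_triplet_loss(anchor, match, non_match)
hard = soft_margin_triplet_loss(anchor, non_match, match)
print(easy < hard)
```

In this sketch the loss falls below `log 2` when the matched image is closer to the anchor than the unmatched one and rises above it when the order is violated, so minimizing it drives the two distances apart in the direction the abstract describes.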
Authors: XUE Zhaohui (薛朝辉); ZHOU Yiyang (周逸飏); QIANG Yonggang (强永刚); LIU Yifeng (刘弋锋); LIN Hui (林晖)
Affiliations: School of Earth Sciences and Engineering, Hohai University, Nanjing 211100, China; School of Computer Science and Technology, University of Science and Technology of China, Hefei 230026, China; National Engineering Laboratory for Social Security Risk Perception and Prevention and Control of Big Data Application, China Academy of Electronics, Beijing 100041, China
Source: National Remote Sensing Bulletin (《遥感学报》), 2021, No. 5, pp. 1095-1107 (13 pages). Indexed in EI, CSCD, and the Peking University Core Journals list.
Funding: National Natural Science Foundation of China (Grant No. 41971279).
Keywords: cross-view; scene image matching and geolocation; Triplet Network; NetVLAD; CNN (Convolutional Neural Networks)