Funding: This work was supported by the Sichuan Science and Technology Program (2023YFG0262).
Abstract: Transformer-based stereo image super-resolution (Stereo SR) methods have significantly improved image quality. However, existing methods pay insufficient attention to detailed features and do not consider the offset of pixels along the epipolar lines in complementary views when integrating stereo information. To address these challenges, this paper introduces a novel epipolar line window attention stereo image super-resolution network (EWASSR). For detail feature restoration, we design a feature extractor based on a Transformer and a convolutional neural network (CNN), which consists of (shifted) window-based self-attention ((S)W-MSA) and feature distillation and enhancement blocks (FDEB). This combination addresses both global image perception and local feature attention, and captures more discriminative high-frequency image features. Furthermore, to address the offset of complementary pixels in stereo images, we propose an epipolar line window attention (EWA) mechanism, which divides windows along the epipolar direction to promote efficient matching of shifted pixels, even in smooth regions. Using adjacent pixels in the window as a reference yields more accurate pixel matching. Extensive experiments demonstrate that EWASSR reconstructs more realistic details. Quantitatively, for 2× SR on the Middlebury and Flickr1024 datasets, EWASSR improves the peak signal-to-noise ratio (PSNR) over a recent network by 0.37 dB and 0.34 dB, respectively.
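The idea of restricting cross-view attention to windows along the epipolar direction can be illustrated with a minimal sketch. It assumes a rectified stereo pair, so that epipolar lines are image rows, and it omits the learned query/key/value projections; the function name `row_window_cross_attention` and the fixed, non-overlapping window layout are hypothetical and are not taken from the EWASSR paper.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def row_window_cross_attention(left_row, right_row, window):
    """Cross-view attention along one image row (assumed epipolar line).

    left_row, right_row: lists of feature vectors for the same row.
    Each left pixel attends only to right pixels inside the same
    window along the row. No learned projections (illustration only).
    """
    assert len(left_row) == len(right_row)
    dim = len(left_row[0])
    scale = 1.0 / math.sqrt(dim)
    out = []
    for start in range(0, len(left_row), window):
        keys = right_row[start:start + window]
        for q in left_row[start:start + window]:
            weights = softmax([dot(q, k) * scale for k in keys])
            # Weighted sum of the right-view features in this window.
            out.append([sum(w * k[c] for w, k in zip(weights, keys))
                        for c in range(dim)])
    return out
```

Because the attention weights are computed only within a row window, each left-view pixel aggregates right-view features from a short horizontal neighbourhood rather than from the whole image.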
Funding: Supported by the Key Basic Research Project of the Strengthening the Foundations Plan of China (Grant No. 2019-JCJQ-ZD-360-12) and the National Defense Basic Scientific Research Program of China (Grant No. JCKY2021208B011), which provided funds for conducting the experiments.
Abstract: High-speed photography is potentially the most effective way to measure the motion parameters of warhead fragments, owing to its high accuracy, high resolution, and high efficiency. However, it faces challenges in dense object tracking and 3D trajectory reconstruction because fragments in a swarm are small and densely distributed. To address these challenges, this work presents a method for tracking warhead fragment motion trajectories and reconstructing their spatio-temporal distribution based on high-speed stereo photography. First, a background difference algorithm extracts the center and area of each fragment in the image sequence. Next, a multi-object tracking (MOT) algorithm using Kalman filtering and Hungarian optimal assignment is developed to achieve real-time, robust trajectory tracking of the fragment swarm. To reconstruct 3D motion trajectories, a global stereo trajectory matching strategy is presented, which exploits the epipolar constraint and a continuity constraint to retrieve the correct stereo correspondences, followed by 3D trajectory refinement using polynomial fitting. Finally, simulation and experimental results demonstrate that the proposed method can accurately track the motion trajectories and reconstruct the spatio-temporal distribution of 1.0×10^3 fragments in a field of view (FOV) of 3.2 m × 2.5 m, and that the velocity estimation accuracy reaches 98.6%.
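The epipolar constraint used for stereo matching can be sketched as follows. This is only an illustration of the constraint itself, not the paper's global matching strategy: the continuity constraint and the trajectory-level optimization are not reproduced. The fundamental matrix `F_RECTIFIED` is a hypothetical example for an ideally rectified pair (epipolar lines are the rows y' = y), and the tolerance `tol` is an assumed parameter.

```python
import math

# Hypothetical fundamental matrix of an ideally rectified stereo pair:
# the epipolar line of left point (x, y) is the right-image row y' = y.
F_RECTIFIED = [[0.0, 0.0, 0.0],
               [0.0, 0.0, -1.0],
               [0.0, 1.0, 0.0]]

def epipolar_line(F, pt):
    """Line l' = F x in the right image for left-image point pt = (x, y)."""
    h = (pt[0], pt[1], 1.0)
    return tuple(sum(F[r][c] * h[c] for c in range(3)) for r in range(3))

def epipolar_distance(F, pt_left, pt_right):
    """Pixel distance from pt_right to the epipolar line of pt_left."""
    a, b, c = epipolar_line(F, pt_left)
    x, y = pt_right
    return abs(a * x + b * y + c) / math.hypot(a, b)

def match_candidates(F, left_pts, right_pts, tol=1.0):
    """For each left point, indices of right points within tol pixels
    of its epipolar line (candidates only; ambiguity is not resolved)."""
    return [[j for j, q in enumerate(right_pts)
             if epipolar_distance(F, p, q) <= tol]
            for p in left_pts]
```

With dense fragment swarms, several right-image detections typically fall near the same epipolar line, which is why an additional cue such as the continuity constraint mentioned in the abstract is needed to pick the correct correspondence.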
Funding: Funded by the National Natural Science Foundation of China (No. 51979275); Key Laboratory of Spatial-temporal Big Data Analysis and Application of Natural Resources in Megacities, MNR (No. KFKT-2022-05); Open Fund of Key Laboratory of Urban Land Resources Monitoring and Simulation, Ministry of Natural Resources (No. KF-2021-06-115); Open Project Program of State Key Laboratory of Virtual Reality Technology and Systems, Beihang University (No. VRLAB2022C10); Jiangsu Province and Education Ministry Co-sponsored Synergistic Innovation Center of Modern Agricultural Equipment (No. XTCX2002); and the 2115 Talent Development Program of China Agricultural University and Chinese Universities Scientific Fund (No. 2021TC105).
Abstract: Binocular stereo vision is the lowest-cost sensor for obtaining 3D information. Given its weakness in long-distance measurement and stability, improving the accuracy and stability of stereo vision is urgently required for precision agriculture applications. To address long-distance measurement and stable perception without a hardware upgrade, this paper introduces hawk-eye-inspired higher-resolution perception and adaptive high dynamic range (HDR). Simulating the function of the 'deep fovea' and 'shallow fovea' in the physiological structure of the hawk eye, the higher-resolution reconstruction method aims to improve accuracy. Inspired by pupil adjustment, the adaptive HDR method is proposed for high-dynamic-range optimisation and stable perception. Under various light conditions, compared with default stereo vision, the proposed algorithm improved accuracy by 28.0% as evaluated by error ratio, and improved stability by 26.56% as evaluated by disparity accuracy. For fixed-distance measurement, the maximum improvement was 78.6% by standard deviation. With the hawk-eye-inspired perception algorithm, the orchard point cloud improved in both quality and quantity. The algorithm substantially advances binocular 3D point cloud reconstruction for orchard navigation maps.
Abstract: When a stereo matching network is trained on a single dataset, it may rely too heavily on features learned from that dataset because training scenes differ, resulting in poor performance across datasets. Feature consistency between matched pixels is therefore a key factor in the network's generalization ability. To address this issue, this paper proposes a more widely applicable stereo matching network that introduces a whitening loss into the feature extraction module and significantly improves the applicability of the model by constraining the variation between salient feature pixels. In addition, a GRU iterative update module is used in the disparity update stage, which expands the model's receptive field at multiple resolutions and allows precise disparity estimation in both richly textured and low-texture areas. The model was trained only on the large-scale Scene Flow dataset, and disparity estimation was evaluated on mainstream datasets such as Middlebury, KITTI 2015, and ETH3D. Compared with earlier stereo matching algorithms, this method not only achieves more accurate disparity estimation but also has wider applicability and stronger robustness.
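Abstract: High mismatch rates in disparity-discontinuous regions and repeated-texture regions have long been the main problem limiting the measurement accuracy of binocular stereo matching. To address this, this paper proposes a stereo matching algorithm based on multi-feature fusion. First, in the cost computation stage, Gaussian weighting is used to assign weights to neighborhood pixels, improving the accuracy of the Sum of Absolute Differences (SAD) algorithm. Next, the binary chain code of the Census transform is improved by fusing the mean gray value of the neighborhood pixels with the gray mean of the gradient image, which establishes the criterion for judging corresponding points in the left and right images and optimizes the code length. Then, an aggregation method is constructed that fuses the cross-based method with an improved guided filter, redistributing disparity values to reduce the mismatch rate. Finally, the initial disparity is obtained with the Winner Take All (WTA) algorithm, and left-right consistency checking and sub-pixel refinement are applied to improve matching accuracy and obtain the final disparity result. Experimental results show that, on the Middlebury dataset, the proposed SAD-Census algorithm achieves average mismatch rates of 2.67% in non-occluded regions and 5.69% in all regions, and the average error when measuring distances of 200~900 mm is less than 2%; the maximum error in actual 3D measurement is 1.5%. The experimental results verify the effectiveness and reliability of the proposed algorithm.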
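The cost computation and WTA steps can be illustrated with a minimal per-pixel sketch. It uses the classic centre-pixel Census transform rather than the paper's improved coding, a Gaussian-weighted SAD, and a simple weighted sum of the two costs; the weight `lam`, the window radius `r`, and `sigma` are assumed parameters, and cost aggregation, the left-right consistency check, and sub-pixel refinement are omitted.

```python
import math

def census(img, y, x, r=1):
    """Classic Census code: one bit per neighbour, set if it is darker
    than the centre pixel (window of size (2r+1) x (2r+1))."""
    centre = img[y][x]
    bits = 0
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            if dy == 0 and dx == 0:
                continue
            bits = (bits << 1) | int(img[y + dy][x + dx] < centre)
    return bits

def hamming(a, b):
    """Number of differing bits between two Census codes."""
    return bin(a ^ b).count("1")

def gaussian_sad(left, right, y, x, d, r=1, sigma=1.0):
    """Sum of absolute differences with Gaussian neighbourhood weights."""
    cost = 0.0
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            w = math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
            cost += w * abs(left[y + dy][x + dx] - right[y + dy][x + dx - d])
    return cost

def wta_disparity(left, right, y, x, max_d, r=1, lam=1.0):
    """Winner-take-all: disparity with the lowest combined cost."""
    best_d, best_cost = 0, float("inf")
    for d in range(max_d + 1):
        if x - d - r < 0:  # window would leave the right image
            break
        cost = gaussian_sad(left, right, y, x, d, r) + lam * hamming(
            census(left, y, x, r), census(right, y, x - d, r))
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d
```

The caller is expected to pass interior pixels only (at least `r` pixels from the image border), since the sketch does not pad the images.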
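Abstract: Automotive driver-assistance safety systems and driverless vehicles are current research hotspots in the imaging field. To address the safety problems that may arise when pedestrians are in front of a vehicle as it starts or drives, this work focuses on a binocular-vision-based method for detecting pedestrians in front of the vehicle. Camera calibration and stereo calibration of the binocular camera are performed; a depth map is obtained with an improved semi-global stereo matching algorithm, and the region of interest (ROI) where pedestrians in front of the vehicle are located is determined, removing redundant background information; dimension-reduced Histogram of Oriented Gradients (HOG) features of the image are segmented and extracted; the features are fed into a Support Vector Machine (SVM) classifier for training, and pedestrian targets in front of the vehicle are detected and marked. Experiments show that the proposed algorithm detects moving pedestrians in front-of-vehicle scenes more effectively, with better detection accuracy, timeliness, and robustness.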
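The core of a HOG descriptor is a per-cell histogram of gradient orientations weighted by gradient magnitude. The sketch below shows only that step for a single grayscale cell, under assumed settings (9 unsigned-orientation bins, central-difference gradients, no interpolation between bins); block normalization, the paper's dimension reduction, and the SVM stage are not shown, and the function name is hypothetical.

```python
import math

def cell_orientation_histogram(cell, bins=9):
    """Unsigned-gradient orientation histogram (0-180 degrees) for one
    grayscale cell given as a list of rows. Border pixels are skipped
    because central differences need both neighbours."""
    h, w = len(cell), len(cell[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]
            gy = cell[y + 1][x] - cell[y - 1][x]
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # Each pixel votes for its orientation bin with its magnitude.
            hist[int(angle // bin_width) % bins] += magnitude
    return hist
```

In a full pipeline, the histograms of all cells inside the ROI would be normalized, concatenated into one feature vector, and passed to the classifier.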
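Abstract: To address the poor reconstruction quality of multi-view stereo networks in challenging regions such as weakly textured or non-Lambertian surfaces, a multi-scale feature extraction module based on three parallel dilated convolutions and an attention mechanism is first proposed. It enlarges the receptive field while capturing dependencies among features to obtain global context, thereby improving the feature representation of the multi-view stereo network in challenging regions for robust feature matching. Second, an attention mechanism is introduced into the 3D CNN used for cost volume regularization, so that the network attends to the important regions of the cost volume when smoothing. In addition, a neural rendering network is built, which uses a rendering reference loss to accurately resolve the geometry and appearance information expressed by the radiance field scene representation, and a depth consistency loss is introduced to maintain geometric consistency between the multi-view stereo network and the neural rendering network, effectively mitigating the adverse effect of noisy cost volumes on the multi-view stereo network. Tested on the indoor DTU dataset, the algorithm achieves completeness and overall scores of 0.289 and 0.326 for point cloud reconstruction, improvements of 24.9% and 8.2% over the baseline method CasMVSNet, with high-quality reconstruction even in challenging regions. On the outdoor Tanks and Temples intermediate dataset, the average F-score of point cloud reconstruction is 60.31, which is 9.9% higher than that of UCS-Net, demonstrating strong generalization ability.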
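For reference, the metrics quoted above follow the usual benchmark definitions, summarized below. The abstract does not state how the percentage improvements were computed, so the relative-improvement formulas are an assumed reading; the symbols m_base and m_ours are introduced here only for illustration.

```latex
% Tanks and Temples: F-score at distance threshold d is the harmonic
% mean of precision P(d) and recall R(d); higher is better.
F(d) = \frac{2\,P(d)\,R(d)}{P(d) + R(d)}

% DTU: the overall score is the mean of accuracy and completeness
% (both are distances; lower is better).
\text{Overall} = \frac{\text{Acc} + \text{Comp}}{2}

% Assumed relative improvement over a baseline score m_base:
% for lower-is-better metrics (completeness, overall)
\Delta_{\downarrow} = \frac{m_{\text{base}} - m_{\text{ours}}}{m_{\text{base}}}
% for higher-is-better metrics (F-score)
\Delta_{\uparrow} = \frac{m_{\text{ours}} - m_{\text{base}}}{m_{\text{base}}}
```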