Funding: Supported by the National Natural Science Foundation of China (Grant No. 60804060) and the Research Fund for the Doctoral Program of Higher Education of China (Grant No. 200800061003).
Abstract: Multi-sensor vision systems play an important role in the 3D measurement of large objects. However, because the sensors are widely distributed, they frequently lack a common field of view (FOV), which makes global calibration of the vision system quite difficult. The primary existing solution relies on large-scale surveying equipment, which is cumbersome and inconvenient for field calibration. In this paper, a global calibration method for multi-sensor vision systems is proposed and investigated. The proposed method uses pairs of skew laser lines, generated by a group of laser pointers, as the calibration objects. Each pair of skew laser lines defines a unique coordinate system in space, which can be reconstructed in a given vision sensor's coordinates by using a planar pattern. The geometry of the sensors is then computed under rigid transformation constraints, taking the coordinate system of each skew line pair as the intermediary. The method is applied to both visual cameras with synthetic data and a real two-camera vision system; the results show its validity and good performance. The prime contribution of this paper is the use of skew laser lines as the global calibration objects, which makes the method simple and flexible. The method needs no expensive equipment and can be used in large-scale calibration.
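The idea of using a skew line pair as an intermediary can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names (`frame_from_skew_lines`, `relative_pose`) and the axis convention (x along the first line, z along the common perpendicular, origin at the foot of the common perpendicular on the first line) are hypothetical choices, and the sketch assumes each sensor has already reconstructed both lines, with consistent direction signs, in its own coordinates.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def unit(a):
    norm = math.sqrt(dot(a, a))
    return [x / norm for x in a]

def mat_vec(m, v):
    return [dot(row, v) for row in m]

def mat_mul(a, b):
    cols = list(zip(*b))
    return [[dot(row, col) for col in cols] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def frame_from_skew_lines(p1, d1, p2, d2):
    """Build a coordinate frame from two skew lines given as point + direction.

    Assumed convention (not taken from the paper): x along line 1,
    z along the common perpendicular, y completing a right-handed frame.
    Returns (R, t), where the columns of R are the axes and t is the origin,
    both expressed in the coordinates the lines were given in.
    """
    x_axis = unit(d1)
    z_axis = unit(cross(d1, d2))
    y_axis = cross(z_axis, x_axis)
    # Foot of the common perpendicular on line 1 (closest point to line 2).
    w = [u - v for u, v in zip(p1, p2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    s = (b * e - c * d) / (a * c - b * b)
    origin = [p + s * q for p, q in zip(p1, d1)]
    return transpose([x_axis, y_axis, z_axis]), origin

def relative_pose(frame_a, frame_b):
    """Pose mapping sensor-B coordinates into sensor-A coordinates.

    Both frames describe the same physical line pair, so
    X_A = R_a R_b^T (X_B - t_b) + t_a.
    """
    (r_a, t_a), (r_b, t_b) = frame_a, frame_b
    r_ab = mat_mul(r_a, transpose(r_b))
    t_ab = [ta - v for ta, v in zip(t_a, mat_vec(r_ab, t_b))]
    return r_ab, t_ab
```

Because the frame is built only from quantities that are preserved by rigid motion (directions, the common perpendicular, closest points), the two sensors recover the same physical frame, and composing the two frame poses yields the sensor-to-sensor transformation without any shared field of view.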
Funding: Supported by the National Natural Science Foundation of China (61473100).
Abstract: To achieve high precision in a three-dimensional (3D) multi-camera measurement system, an efficient multi-camera calibration method is proposed. A method for stitching large-scale calibration targets is derived, and a basic multi-camera calibration method based on the large-scale calibration target is provided. To overcome the shortcomings of this basic method, the vector differences of the reprojection error are modelled under the constraint of a constant rigid body transformation and minimized by the Levenberg-Marquardt (LM) method. Results of the simulation and of the calibration experiment on observation data show that the accuracy of the system calibrated by the proposed method reaches 2 mm at a measuring distance of 20 000 mm over a measurement area of 7 000 mm × 7 000 mm. Consequently, the proposed multi-camera calibration method is more stable than the basic method. The technique offers a more uniform error distribution when measuring large-scale spaces.
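The abstract does not give the error model, so the following is only a toy sketch of minimizing a reprojection error with a Levenberg-Marquardt loop. Everything specific here is an assumption made for illustration: the pinhole camera with an 800-pixel focal length, the three-parameter rigid motion (rotation about the y-axis plus translation in x and z), the forward-difference Jacobian, and the names `project`, `residuals`, `solve_linear`, and `levenberg_marquardt`. The one point it shares with the described method is that a single set of rigid-motion parameters is shared by all observations, which is how a constant rigid body transformation enters the residual.

```python
import math

def project(params, point, focal=800.0):
    """Pinhole projection after a rigid motion (toy model, assumed for illustration)."""
    theta, tx, tz = params
    x, y, z = point
    xc = math.cos(theta) * x + math.sin(theta) * z + tx
    zc = -math.sin(theta) * x + math.cos(theta) * z + tz
    return focal * xc / zc, focal * y / zc

def residuals(params, points, observations):
    """Stacked reprojection errors (predicted minus observed), two per point."""
    res = []
    for point, (u_obs, v_obs) in zip(points, observations):
        u, v = project(params, point)
        res.extend([u - u_obs, v - v_obs])
    return res

def solve_linear(a, b):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def levenberg_marquardt(params, points, observations, iterations=50, lam=1e-3):
    """Minimize the sum of squared reprojection errors; returns (params, cost)."""
    params = list(params)
    res = residuals(params, points, observations)
    cost = sum(r * r for r in res)
    n = len(params)
    for _ in range(iterations):
        # Forward-difference Jacobian, stored one column per parameter.
        cols = []
        for k in range(n):
            step = 1e-6 * max(1.0, abs(params[k]))
            shifted = params[:]
            shifted[k] += step
            res_k = residuals(shifted, points, observations)
            cols.append([(a - b) / step for a, b in zip(res_k, res)])
        jtj = [[dot_ij(cols[i], cols[j]) for j in range(n)] for i in range(n)]
        jtr = [dot_ij(cols[i], res) for i in range(n)]
        # Damped normal equations: (J^T J + lam * diag(J^T J)) delta = -J^T r.
        damped = [[jtj[i][j] + (lam * jtj[i][i] if i == j else 0.0)
                   for j in range(n)] for i in range(n)]
        delta = solve_linear(damped, [-g for g in jtr])
        trial = [p + d for p, d in zip(params, delta)]
        trial_res = residuals(trial, points, observations)
        trial_cost = sum(r * r for r in trial_res)
        if trial_cost < cost:
            # Accept the step and trust the Gauss-Newton direction more.
            params, res, cost = trial, trial_res, trial_cost
            lam /= 10.0
        else:
            # Reject the step and move toward gradient descent.
            lam *= 10.0
    return params, cost

def dot_ij(a, b):
    return sum(x * y for x, y in zip(a, b))
```

Calling `levenberg_marquardt([0.0, 0.0, 0.0], points, observations)` with synthetic observations generated by `project` should recover the parameters used to generate them. A real calibration would have many more parameters (intrinsics, distortion, one pose per target position) and would normally rely on a library solver rather than a hand-written loop.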
Funding: Funded by the Key Research and Development Project of China Academy of Railway Sciences Corporation Limited (2021YJ310).
Abstract: Purpose – This research aims to improve the performance of rail fastener defect inspection across multiple railways and thereby help ensure the safety of railway operation. Design/methodology/approach – First, a fastener region location method based on an online learning strategy is proposed, which locates fastener regions using prior knowledge of the track image together with template matching. The online learning strategy updates the template library dynamically, so that the method can not only locate fastener regions in the track images of multiple railways but also automatically collect and annotate fastener samples. Second, a fastener defect recognition method based on a deep convolutional neural network is proposed. The structure of the recognition network is designed around the small size and relatively uniform content of the fastener region. A data augmentation method based on a sample random sorting strategy is adopted to reduce the impact of sample size imbalance on recognition performance. Findings – The proposed method is tested on rail fastener datasets from multiple railways. The fastener location module achieves an average detection rate of 99.36%, and the fastener defect recognition module achieves an average precision of 96.82%. Originality/value – The proposed method can accurately locate fastener regions and identify fastener defects in the track images of different railways, and it offers high reliability and strong adaptability to multiple railways.
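The abstract does not spell out the sample random sorting strategy, so the sketch below shows only one plausible reading of it, offered as an assumption rather than the authors' procedure: each class gets a random ordering of indices as long as the largest class, and the indices wrap around with a modulo so that small classes repeat their samples until every class is the same size. The function name `balance_by_random_sorting` and the class labels in the usage note are hypothetical.

```python
import random

def balance_by_random_sorting(samples_by_class, seed=0):
    """Oversample every class to the size of the largest one.

    Assumed interpretation of "sample random sorting": a random permutation
    of range(target) per class, wrapped by the class size, so that minority
    samples are repeated in random order.
    """
    rng = random.Random(seed)
    target = max(len(samples) for samples in samples_by_class.values())
    balanced = []
    for label, samples in samples_by_class.items():
        order = list(range(target))
        rng.shuffle(order)
        # Index modulo the class size so small classes cycle through their samples.
        balanced.extend((samples[i % len(samples)], label) for i in order)
    # Mix the classes so training batches are not ordered by label.
    rng.shuffle(balanced)
    return balanced
```

For example, passing `{"normal": [...ten image paths...], "defective": [...three image paths...]}` returns twenty `(sample, label)` pairs, ten per class, with each defective sample appearing three or four times. In practice the repeated samples would usually also pass through image-level augmentation so the network does not see identical copies.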
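Abstract: To address the limited camera viewing angle of traditional detection methods, a multi-view facial fatigue detection method is proposed that combines facial pose correction with an improved ViViT. Mediapipe Face Mesh is used to locate 3D facial landmarks and rectify them to a frontal view, and the proposed FGR-ViViT model captures changes across the frame sequence of the rectified line images of the eyes, eyebrows, and mouth. FGR-ViViT adds a part selection module to the Temporal Transformer Encoder of ViViT to capture subtle differences in features along the temporal dimension, and it combines two dropout passes with an improved contrastive loss function to adjust the similarity between samples, reducing the risk of overfitting and improving generalization. Experimental results show that, on the test sets of rectified line image frames from YawDD and DROZY, the proposed method reaches F1-scores of 94.5% and 97.6%, improvements of 3.2% and 10.4% respectively over the original face image frames, and that FGR-ViViT improves on the original ViViT by 6.1% and 0.7% respectively. The proposed method suits a variety of application scenarios in which cameras are placed flexibly, and it contributes to solving multi-view facial drowsiness assessment.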
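Abstract: To address the problem that, in deep-learning-based feature extraction networks, the features extracted by deep convolutional neural networks lack low-level semantic information, this paper proposes a semantically enhanced multi-view stereo method. First, a ConvLSTM (Convolutional Long Short-Term Memory) semantic aggregation network is proposed. Using the ConvLSTM structure, it runs prediction over the feature maps extracted by multiple convolutional layers to obtain a feature map that fuses the semantic information of every layer. As high-level image features are extracted layer by layer in space, the memory function of the long short-term memory structure strengthens the low-level semantic information in the high-level feature maps, which improves reconstruction in weakly textured regions and the robustness and completeness of 3D reconstruction. Second, a visibility network is proposed: starting from the grayscale image, it highlights the features of visible regions on the feature map, increasing the influence of visible regions in the feature map and helping to improve 3D reconstruction. Finally, the texture information of the image is extracted and fed into the ConvLSTM semantic aggregation network to extract deep features, which improves reconstruction in weakly textured regions. Compared with mainstream multi-view stereo reconstruction methods, the proposed method achieves better reconstruction results.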