Abstract: Diagnosing road damage is a key step in preventive road maintenance. This paper therefore proposes a road damage detection algorithm based on RDNet (Road Detection Network). The algorithm improves feature extraction and representation from several angles, with improvements that include cross-stage partial multi-branch convolution, residual parallel dilated convolution, and an adaptive-scale spatial attention module. End-to-end training on a self-built road damage dataset improves the detection accuracy and generalization ability of the algorithm. Experimental results show that, compared with YOLOv5s, the proposed RDNet algorithm improves the mean average precision (mAP) by 1.3% and also gives good detection results on difficult samples, so it can be applied effectively to the maintenance of real roads and improve the efficiency and accuracy of road damage detection.
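The abstract names residual parallel dilated convolution without giving its layout. The following is only a minimal 1D sketch of the general idea, assuming that several dilated branches share one odd-length kernel, that their outputs are summed, and that the input is added back as a residual. The function names, the dilation rates (1, 2, 4), and the summation are illustrative assumptions, not RDNet's actual design.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Same-length 1D dilated cross-correlation with zero padding.

    Assumes an odd-length kernel so that it has a well-defined centre tap.
    """
    half = (len(kernel) - 1) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + (k - half) * dilation  # taps are spaced `dilation` apart
            if 0 <= j < len(signal):       # out-of-range taps count as zero
                acc += w * signal[j]
        out.append(acc)
    return out


def residual_parallel_dilated(signal, kernel, dilations=(1, 2, 4)):
    """Run parallel dilated branches, sum them, then add the input back."""
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    fused = [sum(vals) for vals in zip(*branches)]   # hypothetical fusion: sum
    return [x + f for x, f in zip(signal, fused)]    # residual connection


# Usage: an impulse shows how each branch reaches a different distance.
impulse = [0, 0, 0, 0, 1, 0, 0, 0, 0]
print(residual_parallel_dilated(impulse, [1.0, 1.0, 1.0]))
```

Larger dilation rates widen the receptive field without adding kernel weights, which is the usual motivation for running such branches in parallel.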
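Abstract: Current image captioning methods describe an image with a black-box architecture that is hard to control from outside. To address this, image captioning is recast as a seq2seq problem so that captions can be generated in a controllable way. A set or sequence of entities, each made up of image regions, serves as the control signal. Guided by a chunk sentinel that switches between entity chunks and by an adaptive attention mechanism with a visual sentinel, the control signal is fed in an orderly way into a two-layer long short-term memory (LSTM) network, steering the model to generate the corresponding caption. In addition, the baseline is trained with cross-entropy loss and early stopping, and a reinforcement learning approach is introduced to resolve the mismatch between the training objective and the metrics used for evaluation, which further improves the model. Experiments on the MSCOCO and Flickr30k datasets show that the proposed algorithm achieves very good results in controllable caption generation, caption quality, and diversity.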
Funding: National Youth Natural Science Foundation of China (No. 61806006); Innovation Program for Graduate of Jiangsu Province (No. KYLX160-781); Jiangsu University Superior Discipline Construction Project.
Abstract: Distant and hard objects are difficult to detect because LiDAR point clouds are sparse and carry insufficient semantic information. To address this, a 3D object detection network with adaptive fusion of multi-modal data is proposed, which makes use of both multi-neighborhood voxel information and image information. First, an improved ResNet is designed that preserves the structural information of distant and hard objects in low-resolution feature maps, making it better suited to the detection task. Meanwhile, the semantics of each image feature map are enhanced with semantic information from all subsequent feature maps. Second, multi-neighborhood context information with different receptive field sizes is extracted to compensate for the sparseness of the point cloud, which improves the ability of voxel features to represent the spatial structure and semantic information of objects. Finally, a multi-modal feature adaptive fusion strategy is proposed that uses learnable weights to express the contribution of each modality's features to the detection task, and voxel attention further enhances the fused feature representation of valid target objects. Experimental results on the KITTI benchmark show that the method outperforms VoxelNet by a remarkable margin, increasing AP by 8.78% and 5.49% on the medium and hard difficulty levels, respectively. The method also performs better than many mainstream multi-modal methods, for example exceeding the AP of MVX-Net by 1% on the medium and hard difficulty levels.
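The fusion strategy is described only as using learnable weights to express each modality's contribution. One common way to realise that idea, shown here as a sketch under assumptions, is to keep one learnable logit per modality, normalise the logits with a softmax, and take the weighted sum of the voxel and image features. The softmax normalisation and the names `adaptive_fuse` and `logits` are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a short list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_fuse(voxel_feat, image_feat, logits):
    """Fuse two equal-length feature vectors with learnable modality weights.

    `logits` holds one learnable scalar per modality (voxel, image); in a
    real network these would be trained along with the other parameters.
    """
    w_voxel, w_image = softmax(logits)
    return [w_voxel * v + w_image * i for v, i in zip(voxel_feat, image_feat)]


# Usage: equal logits give an even blend of the two modalities.
print(adaptive_fuse([2.0, 4.0], [0.0, 0.0], logits=[0.0, 0.0]))
```

Because the weights sum to one, raising one modality's logit during training shifts the fused feature toward that modality.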
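Abstract: Data from wind farm supervisory control and data acquisition (SCADA) systems is often lost during acquisition and transmission. To address this problem, a new adaptive lightweight generative adversarial imputation strategy, adaptive transformer slim GAIN (AT-SGAIN), is proposed to improve data completeness. AT-SGAIN simplifies the GAIN model structure, which significantly improves computational efficiency, and adopts a dual-discriminator structure, with the discriminators used to discriminate real data and generated data respectively, so that imputation accuracy is maintained as speed increases. The algorithm integrates a Transformer encoder to better capture the time-series characteristics of wind power data, and uses an adaptive dual-branch attention mechanism to adjust channel and spatial attention weights precisely, increasing the network's sensitivity to local information. Experimental results show that the proposed algorithm significantly outperforms existing classical methods in multiple comparative tests.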
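In GAIN-style imputation generally, observed entries are kept and only the missing entries are filled from the generator output, with a binary mask selecting between the two. The sketch below shows just that final combination step. The generator output is a hard-coded placeholder, and the function name is hypothetical, so this does not represent AT-SGAIN's network.

```python
def combine_imputation(observed, mask, generated):
    """Keep observed values where mask == 1, else take the generated value.

    observed:  readings, with any placeholder (e.g. None) where data is missing
    mask:      1 for an observed entry, 0 for a missing entry
    generated: generator output for every position (placeholder values here)
    """
    return [x if m == 1 else g for x, m, g in zip(observed, mask, generated)]


# Usage: the second reading was lost in transmission and is filled in.
readings = [5.0, None, 7.0]
mask = [1, 0, 1]
generator_output = [4.9, 6.1, 7.2]  # placeholder, not a trained model
print(combine_imputation(readings, mask, generator_output))
```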