Real-time detection and instance segmentation of strawberries are fundamental components in the development of strawberry harvesting robots. Real-time identification of strawberries in an unstructured environment is a challenging task. Current instance segmentation algorithms for strawberries suffer from poor real-time performance and low accuracy. To this end, the present study proposes an Efficient YOLACT (E-YOLACT) algorithm for strawberry detection and segmentation based on the YOLACT framework. The key enhancements of E-YOLACT include a lightweight attention mechanism, pyramid squeeze shuffle attention (PSSA), for efficient feature extraction. Additionally, an attention-guided context-feature pyramid network (AC-FPN) is employed instead of FPN to optimize the architecture's performance. Furthermore, a feature-enhanced model (FEM) is introduced to enhance the prediction head's capabilities, while efficient fast non-maximum suppression (EF-NMS) is devised to improve non-maximum suppression. The experimental results show that E-YOLACT achieves a Box-mAP and Mask-mAP of 77.9 and 76.6, respectively, on the custom dataset, with a category accuracy of 93.5%. E-YOLACT also runs in real time at 34.8 FPS. The proposed method offers an efficient approach for the vision system of a strawberry-picking robot.
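As background for the non-maximum suppression step mentioned in the abstract above, the following is a minimal sketch of classical greedy NMS in plain Python. It is not the EF-NMS variant proposed in the paper, whose details the abstract does not give. The function names `iou` and `greedy_nms`, the `(x1, y1, x2, y2)` box layout, and the 0.5 threshold are assumptions chosen for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return the indices of the boxes that survive, highest score first.

    A box is kept only if its IoU with every already-kept box is at or
    below the threshold (hypothetical default of 0.5).
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(greedy_nms(boxes, scores))
```

The greedy loop is sequential, which is one reason real-time segmentation models replace it with parallel or matrix-based variants.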
Autonomous driving technology has achieved a great deal with deep learning, and vehicle detection and classification have become critical technologies in autonomous driving systems. Vehicle instance segmentation performs instance-level semantic parsing of vehicle information, which is more accurate and reliable than object detection. However, existing instance segmentation algorithms still suffer from poor mask prediction accuracy and low detection speed. Therefore, this paper proposes a real-time instance segmentation model named FIR-YOLACT, which fuses ICIoU (Improved Complete Intersection over Union) and Res2Net into the YOLACT algorithm. Specifically, the ICIoU function addresses the degradation problem of the original CIoU loss function and improves training convergence speed and detection accuracy. A Res2Net module fused with ECA (Efficient Channel Attention) Net is added to the model's backbone network, which improves multi-scale detection capability and mask prediction accuracy. Furthermore, the Cluster NMS (Non-Maximum Suppression) algorithm is introduced in the model's bounding box regression to improve the detection of similar occluded objects. The experimental results demonstrate the superiority of FIR-YOLACT over the baseline methods and the effectiveness of all components. The processing speed reaches 28 FPS, which meets the demands of real-time vehicle instance segmentation.
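For readers unfamiliar with the original CIoU loss that the abstract above refers to, here is a minimal sketch of the standard formulation, L = 1 - IoU + rho^2/c^2 + alpha*v. It does not implement the paper's ICIoU, which the abstract does not define. The function name `ciou_loss`, the `(x1, y1, x2, y2)` box layout, and the `eps` stabilizer are assumptions for illustration.

```python
import math


def ciou_loss(pred, target, eps=1e-9):
    """Standard CIoU loss between two boxes given as (x1, y1, x2, y2)."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Plain IoU term.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Squared distance between the two box centres.
    rho2 = ((px1 + px2) - (tx1 + tx2)) ** 2 / 4 + ((py1 + py2) - (ty1 + ty2)) ** 2 / 4

    # Squared diagonal of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(tw / (th + eps)) - math.atan(pw / (ph + eps))) ** 2
    alpha = v / (1 - iou + v + eps)

    return 1 - iou + rho2 / c2 + alpha * v


if __name__ == "__main__":
    print(ciou_loss((0, 0, 10, 10), (0, 0, 10, 10)))
    print(ciou_loss((0, 0, 10, 10), (20, 20, 30, 30)))
```

Note that `v` is zero whenever the two boxes share the same aspect ratio, regardless of their sizes, so in that case the aspect-ratio penalty contributes nothing to the loss.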
Accurate perception of lane line information is one of the basic requirements of unmanned driving technology, as it underlies vehicle localization and the determination of the forward direction. In this paper, multi-level constraints are added to the lane line detection model PINet to improve its perception of lane lines. Lane lines predicted by the network are assigned real and imaginary attributes, which enhance the perception of features around the lane lines and act as pixel-level constraints; images are converted to bird's-eye views, where the parallelism between lane lines is reconstructed, providing lane-line-level constraints on the predicted lane lines; and vanishing points are used to focus on the image hierarchy, providing image-level constraints. The proposed model meets both accuracy (96.44%) and real-time (30+ FPS) requirements, and it has been field-tested on a highway, where it performed stably.
This work proposes a method for the detection and identification of parked vehicles. The technique combines several algorithms for the detection, localization, segmentation, extraction and recognition of number plates in images. It is an image processing technology used to identify vehicles by their number plates. We work on grayscale images sampled at 120×180, taken from a large database provided by PSA. We present two algorithms for detecting the horizontal position of the vehicle: the classical "horizontal gradients" method and our own "symmetrical method". A car seen from the front has a plane of symmetry, and detecting its axis gives the car's position in the image. Localization uses the MGD (Maximum Gradient Difference) parameter, which locates all text segments in each horizontal scan. A dedicated filtering technique, combining the symmetry method with MGD localization, eliminates the blocks that do not cross the axis of symmetry and thus finds the block containing the number plate. Once the plate is located, four algorithms are applied so that the system can identify the license plate. The first adjusts the intensity and contrast of the image. The second segments the characters on the plate using the profile method. The characters are then extracted and resized, and finally recognized by means of optical character recognition (OCR). The efficiency of these algorithms is evaluated on a test database of 350 images. We obtain a localization rate of 99.6% on the 350 images, with a false alarm rate (wrong text block) of 0.88% per image.
Traffic sign recognition (TSR, or Road Sign Recognition, RSR) is one of the Advanced Driver Assistance System (ADAS) functions in modern cars. To address the two most important issues, real-time performance and resource efficiency, we propose a high-efficiency hardware implementation for TSR. We divide the TSR procedure into two stages, detection and recognition. In the detection stage, under the assumption that most German traffic signs are red or blue with circular, triangular or rectangular shapes, we use a Normalized RGB color transform and Single-Pass Connected Component Labeling (CCL) to find potential traffic signs efficiently. For Single-Pass CCL, our contribution is to eliminate the "merge-stack" operations by recording the connection relations of regions in the scan phase and updating the labels in the iterating phase. In the recognition stage, the Histogram of Oriented Gradient (HOG) is used to generate the descriptor of the signs, and the signs are classified with a Support Vector Machine (SVM). In the HOG module, we analyze the minimum number of bits required for different recognition rates. The proposed method achieves a 96.61% detection rate and a 90.85% recognition rate when tested on the GTSDB dataset. Our hardware implementation reduces the storage of CCL and simplifies the HOG computation. The main CCL storage size is reduced by 20% compared to the most advanced design under typical conditions. Using TSMC 90 nm technology, the proposed design operates at a 105 MHz clock rate and processes 135 fps at an image size of 1360 × 800. The chip size is about 1 mm² and the power consumption is close to 8 mW. This work is therefore resource efficient and meets the real-time requirement.
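The normalized RGB transform used in the detection stage above divides each channel by the sum of the three channels, which reduces sensitivity to overall brightness. The sketch below shows that transform and a simple per-channel threshold in plain Python. The function names `normalized_rgb` and `color_mask`, the nested-list image layout, and the threshold value 0.5 are assumptions for illustration; the paper's hardware thresholds are not stated in the abstract.

```python
def normalized_rgb(pixel):
    """Map an (R, G, B) pixel to chromaticity (r, g, b) with r + g + b = 1."""
    r, g, b = pixel
    total = r + g + b
    if total == 0:
        # Pure black has no defined chromaticity; return zeros by convention.
        return (0.0, 0.0, 0.0)
    return (r / total, g / total, b / total)


def color_mask(image, channel, threshold):
    """Binary mask: 1 where the chosen normalized channel exceeds the threshold.

    image is a list of rows, each row a list of (R, G, B) tuples;
    channel is 0 for red, 1 for green, 2 for blue.
    """
    return [
        [1 if normalized_rgb(px)[channel] > threshold else 0 for px in row]
        for row in image
    ]


if __name__ == "__main__":
    image = [
        [(200, 20, 20), (90, 90, 90)],
        [(0, 0, 0), (10, 10, 200)],
    ]
    print(color_mask(image, 0, 0.5))  # red candidates
    print(color_mask(image, 2, 0.5))  # blue candidates
```

A binary mask of this kind is the sort of input that a connected component labeling pass would then group into candidate regions.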
Over the last decades, the underground network has been expanding to cope with the increasing movement of people and freight. As a consequence, it is vital to guarantee the full functionality of the tunnel network by means of preventive maintenance and monitoring of the tunnel lining state over time. A new method has been developed for the real-time prediction of the utilization level in tunnel segmental linings based on input monitoring data. The new concept is founded on a framework that comprises an offline and an online stage. In the former, feedforward neural networks are generated using synthetically produced data. Finite element simulations of the lining structure are conducted to analyze the structural response under multiple loading conditions. The scenarios are generated by assuming ranges of variation of the model input parameters to account for the uncertainty due to the not fully determined in situ conditions. Input and target quantities are identified to better assess the structural utilization of the lining. The latter stage consists of applying the methodological framework to monitored input data, which allows real-time prediction of the physical quantities used to estimate the lining utilization. The approach is validated on a full-scale test of a segmental lining, where the predicted quantities are compared with the actual measurements. Finally, the influence of artificial noise added to the training data on the overall prediction performance is investigated, and the benefits and limits of the concept are set out.
Lane line departure warning systems have recently been developed and brought to market. For any of these systems, the key to lane line tracking, lane line identification, or lane line departure warning is whether lane lines can be detected accurately and quickly. Since the 1990s, such systems have been studied and implemented for situations with good viewing conditions and clear lane markings on the road. Since then, accuracy in particular situations, robustness over a wide range of scenarios, time efficiency and integration into higher-order tasks have kept visual lane line detection and tracking a continuing research subject. At present, lane marking detection methods based on machine vision and image processing can be divided into two categories: traditional image processing methods and semantic segmentation (including deep learning) methods. The former mainly involves feature-based and model-based steps; the feature-based steps can be classified into similarity-based and discontinuity-based ones, and the model-based step includes different parametric straight line, curve or pattern models. Semantic segmentation includes different machine learning, neural network and deep learning methods, and is the new trend in the research and application of lane line departure warning systems. This paper describes and analyzes lane line departure warning systems, image processing algorithms and semantic segmentation methods for lane line detection.
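To address the problems faced by tunnel linings built in alpine cold regions, including harsh environments, frequent frost damage, many interfering factors in lining images, inconsistent scales of frost-damage targets, and the low efficiency and high cost of traditional manual visual inspection, a frost-damage detection method for tunnel linings in alpine cold regions based on HRNetV2 is proposed. First, an improved model is built with HRNetV2 as the base model: transfer learning is incorporated into the backbone feature-extraction network, an attention mechanism is introduced into the structure to strengthen the model's ability to learn frost-damage features, and Focal loss is used as the loss function to address class imbalance. To verify the performance of the improved model, images of frost damage in cold-region tunnel linings were collected with a high-definition camera and, after cropping and data augmentation, a frost-damage dataset of 2800 images was built. The experimental results show that the improved model reaches a mean intersection over union (mIoU) of 89.05% on the frost-damage dataset, 5.41% higher than the original model; it shows good robustness on frost damage with complex morphology and can be applied directly to high-resolution original images; and its overall performance is better than that of DeeplabV3+, U-Net, and PSPNet. The proposed method can detect lining frost damage accurately and safely, and can provide technical support for the intelligent operation and maintenance of tunnels in alpine cold regions.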
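The abstract above relies on two standard quantities: mean intersection over union (mIoU) as the evaluation metric and Focal loss as a remedy for class imbalance. The sketch below shows the usual definitions of both in plain Python. It is not the paper's implementation; the function names `mean_iou` and `binary_focal_loss`, the flat label lists, and the defaults `gamma=2.0` and `alpha=0.25` are assumptions for illustration.

```python
import math


def mean_iou(pred, target, num_classes):
    """Mean IoU over classes for two equal-length lists of class labels.

    Classes that appear in neither list are skipped so that they do not
    distort the average.
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Focal loss for one pixel: -alpha_t * (1 - p_t)**gamma * log(p_t).

    p is the predicted probability of the positive class (0 < p < 1)
    and y is the ground-truth label, 0 or 1.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
    print(binary_focal_loss(0.9, 1))  # easy example, strongly down-weighted
    print(binary_focal_loss(0.1, 1))  # hard example, much larger loss
```

With `gamma=0` and `alpha=1` the focal loss reduces to ordinary cross-entropy on the positive class; larger `gamma` shrinks the contribution of well-classified pixels, which is how it counteracts a dominant background class.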
Funding (E-YOLACT strawberry segmentation study): funded by the Anhui Provincial Natural Science Foundation (No. 2208085ME128); the Anhui University-Level Special Project of Anhui University of Science and Technology (No. XCZX2021-01); the Research and Development Fund of the Institute of Environmental Friendly Materials and Occupational Health, Anhui University of Science and Technology (No. ALW2022YF06); and the Anhui Province New Era Education Quality Project (Graduate Education) (No. 2022xscx073).
Funding (FIR-YOLACT vehicle instance segmentation study): supported by the Natural Science Foundation of Guizhou Province (Grant Number: 20161054); the Joint Natural Science Foundation of Guizhou Province (Grant Number: LH20177226); the 2017 Special Project of New Academic Talent Training and Innovation Exploration of Guizhou University (Grant Number: 20175788); and the National Natural Science Foundation of China under Grant No. 12205062.
Funding (tunnel segmental lining utilization study): funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation, Project No. 77309832) within Subprojects C1 and B2 of the Collaborative Research Center SFB 837 "Interaction Modeling in Mechanised Tunnelling", sited at the Ruhr University Bochum, Germany.
Funding (lane line departure warning survey): financially supported by the National Natural Science Foundation of China (grant No. 61170147) and the Scientific and Technological Project of Shaanxi Province in China (grant No. 2019GY-038).