Abstract: To enhance the efficiency and accuracy of environmental perception for autonomous vehicles, we propose GDMNet, a unified multi-task perception network for autonomous driving capable of performing drivable area segmentation, lane detection, and traffic object detection. First, in the encoding stage, features are extracted and the Generalized Efficient Layer Aggregation Network (GELAN) is used to enhance feature extraction and gradient flow. Second, in the decoding stage, specialized detection heads are designed: the drivable area segmentation head employs DySample to upsample feature maps, while the lane detection head merges early-stage features and processes the output through the Focal Modulation Network (FMN). Finally, the Minimum Point Distance IoU (MPDIoU) loss function is used to compute the degree of match between ground-truth traffic object boxes and predicted boxes, guiding model training. Experimental results on the BDD100K dataset demonstrate that the proposed network achieves a drivable area segmentation mean intersection over union (mIoU) of 92.2%, lane detection accuracy and intersection over union (IoU) of 75.3% and 26.4%, respectively, and traffic object detection recall and mAP of 89.7% and 78.2%, respectively. This detection performance surpasses that of other single-task and multi-task models.
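As a hedged illustration of the bounding-box loss mentioned above (a sketch following the published MPDIoU formulation, not the authors' GDMNet implementation), MPDIoU augments standard IoU with penalties on the distances between corresponding top-left and bottom-right corners, normalized by the squared image diagonal:

```python
def mpdiou(box_pred, box_gt, img_w, img_h):
    """Minimum Point Distance IoU between two axis-aligned boxes.

    Boxes are (x1, y1, x2, y2) with (x1, y1) the top-left corner.
    The score equals IoU minus normalized corner-distance penalties,
    so identical boxes score 1.0 and poorly aligned boxes score lower.
    """
    # Standard intersection-over-union
    ix1, iy1 = max(box_pred[0], box_gt[0]), max(box_pred[1], box_gt[1])
    ix2, iy2 = min(box_pred[2], box_gt[2]), min(box_pred[3], box_gt[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_p = (box_pred[2] - box_pred[0]) * (box_pred[3] - box_pred[1])
    area_g = (box_gt[2] - box_gt[0]) * (box_gt[3] - box_gt[1])
    iou = inter / (area_p + area_g - inter)

    # Squared distances between corresponding top-left and bottom-right corners
    d1 = (box_pred[0] - box_gt[0]) ** 2 + (box_pred[1] - box_gt[1]) ** 2
    d2 = (box_pred[2] - box_gt[2]) ** 2 + (box_pred[3] - box_gt[3]) ** 2

    # Normalize by the squared image diagonal so the penalty is scale-invariant
    norm = img_w ** 2 + img_h ** 2
    return iou - d1 / norm - d2 / norm
```

During training the loss would be taken as 1 - mpdiou(...), so that perfectly matched corners drive the loss to zero.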
Funding: This research was partially supported by the National Natural Science Foundation of China (61773312), the National Key Research and Development Plan (2017YFC0803905), and the Program of Introducing Talents of Discipline to University (B13043).
Abstract: The randomness and complexity of urban traffic scenes make it difficult for self-driving cars to detect drivable areas. Inspired by human driving behavior, we propose a novel method of drivable area detection for self-driving cars that fuses pixel information from a monocular camera with spatial information from a light detection and ranging (LIDAR) scanner. Analogous to the bijection of collineation, a new concept called co-point mapping, a bijection that maps points from the LIDAR scanner to points on the edge of the image segmentation, is introduced in the proposed method. Our method positions candidate drivable areas through self-learning models based on initial drivable areas obtained by fusing obstacle information with superpixels. In addition, four features are fused to achieve more robust performance. In particular, a feature called drivable degree (DD) is proposed to characterize how drivable each LIDAR point is. After the initial drivable area is characterized by the features obtained through self-learning, a Bayesian framework is used to calculate the final probability map of the drivable area. Our approach introduces no common hypotheses and requires no training steps, yet it yields state-of-the-art performance when tested on the ROAD-KITTI benchmark. Experimental results demonstrate that the proposed method is a general and efficient approach for detecting drivable areas.
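The Bayesian fusion step above can be sketched in a minimal form. This is an illustrative naive-Bayes combination under a conditional-independence assumption, not the paper's exact framework; the likelihood values and the `drivable_posterior` helper are hypothetical:

```python
def drivable_posterior(likelihoods, prior=0.5):
    """Fuse per-feature evidence into a posterior probability of 'drivable'.

    likelihoods: list of (P(feature | drivable), P(feature | not drivable))
    pairs, one per fused feature, assumed conditionally independent.
    Returns P(drivable | all features) by Bayes' rule.
    """
    p_d = prior          # running numerator for the 'drivable' hypothesis
    p_nd = 1.0 - prior   # running numerator for 'not drivable'
    for p_f_given_d, p_f_given_nd in likelihoods:
        p_d *= p_f_given_d
        p_nd *= p_f_given_nd
    return p_d / (p_d + p_nd)  # normalize over both hypotheses
```

Evaluating this per pixel (or per superpixel) over the image would yield the kind of probability map the abstract describes.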
Funding: Supported by the National Natural Science Foundation of China (Nos. 61876212 and 1733007), Zhejiang Laboratory, China (No. 2019NB0AB02), and the Hubei Province College Students Innovation and Entrepreneurship Training Program, China (No. S202010487058).
Abstract: A panoptic driving perception system is an essential part of autonomous driving. A high-precision, real-time perception system can assist the vehicle in making reasonable decisions while driving. We present a panoptic driving perception network, you only look once for panoptic (YOLOP), that performs traffic object detection, drivable area segmentation, and lane detection simultaneously. It is composed of one encoder for feature extraction and three decoders that handle the specific tasks. Our model performs extremely well on the challenging BDD100K dataset, achieving state-of-the-art results on all three tasks in terms of accuracy and speed. In addition, we verify the effectiveness of our multi-task model for joint training via ablation studies. To the best of our knowledge, this is the first work that can process these three visual perception tasks simultaneously in real time on an embedded device (Jetson TX2, 23 FPS) while maintaining excellent accuracy. To facilitate further research, the source code and pre-trained models are released at https://github.com/hustvl/YOLOP.
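The segmentation tasks in networks like the ones above are typically scored with mean intersection over union (mIoU), the metric quoted throughout this listing. A minimal sketch of how mIoU is computed from dense class-label maps (a generic illustration, not any paper's evaluation code):

```python
import numpy as np

def mean_iou(pred, gt, num_classes):
    """Mean intersection-over-union across classes for label maps.

    pred, gt: integer arrays of the same shape holding per-pixel class IDs.
    Classes absent from both maps are skipped rather than counted as 0.
    """
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:
            ious.append(inter / union)  # per-class IoU
    return float(np.mean(ious))        # average over present classes
```

For binary drivable-area segmentation, `num_classes` would be 2 (background and drivable).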
Funding: This work was supported in part by the National Natural Science Foundation of China (Grant No. U1864203) and in part by the International Science and Technology Cooperation Program of China (No. 2016YFE0102200).
Abstract: Driving space for autonomous vehicles (AVs) is a simplified representation of real driving environments that facilitates driving decision processes. The existing literature presents numerous methods for constructing driving spaces, a fundamental step in AV development. This study reviews existing research to gain a more systematic understanding of driving space, focusing on two questions: how to reconstruct the driving environment, and how to make driving decisions within the constructed driving space. Furthermore, the advantages and disadvantages of different types of driving space are analyzed. The study deepens understanding of the relationship between perception and decision-making and offers insight into directions for future research on the driving space of AVs.