Abstract: To enhance the efficiency and accuracy of environmental perception for autonomous vehicles, we propose GDMNet, a unified multi-task perception network for autonomous driving that performs drivable area segmentation, lane detection, and traffic object detection. First, in the encoding stage, the Generalized Efficient Layer Aggregation Network (GELAN) is used to enhance feature extraction and gradient flow. Second, in the decoding stage, specialized detection heads are designed: the drivable area segmentation head employs DySample to upsample feature maps, and the lane detection head merges early-stage features and processes the output through the Focal Modulation Network (FMN). Finally, the Minimum Point Distance IoU (MPDIoU) loss function is employed to measure the match between ground-truth and predicted traffic object boxes, guiding model training. Experimental results on the BDD100K dataset show that the proposed network achieves a drivable area segmentation mean intersection over union (mIoU) of 92.2%, lane detection accuracy and intersection over union (IoU) of 75.3% and 26.4%, respectively, and traffic object detection recall and mAP of 89.7% and 78.2%, respectively. This detection performance surpasses that of other single-task and multi-task models.
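The abstract names the MPDIoU loss without spelling it out. As a minimal sketch, assuming the commonly cited definition (IoU minus the squared distances between the two boxes' top-left corners and bottom-right corners, each normalized by the squared image diagonal), the loss for one box pair could be computed as follows. The function names `iou` and `mpdiou_loss`, the `(x1, y1, x2, y2)` box layout, and the toy coordinates are illustrative assumptions, not GDMNet's actual implementation.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def mpdiou_loss(pred, target, img_w, img_h):
    """1 - MPDIoU for one predicted/ground-truth box pair (assumed definition)."""
    # Squared distance between the top-left corners.
    d1_sq = (pred[0] - target[0]) ** 2 + (pred[1] - target[1]) ** 2
    # Squared distance between the bottom-right corners.
    d2_sq = (pred[2] - target[2]) ** 2 + (pred[3] - target[3]) ** 2
    # Normalize by the squared diagonal of the input image.
    norm = img_w ** 2 + img_h ** 2
    mpdiou = iou(pred, target) - d1_sq / norm - d2_sq / norm
    return 1.0 - mpdiou


# Toy example on a hypothetical 100x100 image.
perfect = mpdiou_loss((0, 0, 10, 10), (0, 0, 10, 10), 100, 100)
shifted = mpdiou_loss((0, 0, 10, 10), (5, 5, 15, 15), 100, 100)
disjoint = mpdiou_loss((0, 0, 10, 10), (50, 50, 60, 60), 100, 100)
```

Under this definition a perfect match gives zero loss, and disjoint boxes still produce a distance-dependent penalty, so the loss keeps a useful gradient when plain IoU is already zero.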
Funding: Supported by the National Natural Science Foundation of China (Nos. 61876212 and 1733007), Zhejiang Laboratory, China (No. 2019NB0AB02), and the Hubei Province College Students Innovation and Entrepreneurship Training Program, China (No. S202010487058).
Abstract: A panoptic driving perception system is an essential part of autonomous driving. A high-precision, real-time perception system can assist the vehicle in making reasonable decisions while driving. We present a panoptic driving perception network (You Only Look Once for Panoptic, YOLOP) that performs traffic object detection, drivable area segmentation, and lane detection simultaneously. It is composed of one encoder for feature extraction and three decoders that handle the specific tasks. Our model performs extremely well on the challenging BDD100K dataset, achieving state-of-the-art results on all three tasks in terms of accuracy and speed. In addition, we verify the effectiveness of our multi-task learning model for joint training via ablation studies. To the best of our knowledge, this is the first work that can process these three visual perception tasks simultaneously in real time on an embedded device, the Jetson TX2 (23 FPS), while maintaining excellent accuracy. To facilitate further research, the source code and pre-trained models are released at https://github.com/hustvl/YOLOP.
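The one-encoder, three-decoder layout described above can be sketched structurally as follows. This is a toy illustration only: `MultiTaskNet`, the `toy_*` functions, and the list-of-floats "image" are hypothetical stand-ins and do not reflect YOLOP's real layers or API. The point it shows is that the shared features are computed once per input and reused by every task head.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Features = List[float]


@dataclass
class MultiTaskNet:
    """One shared encoder feeding several task-specific heads (hypothetical)."""
    encoder: Callable[[List[float]], Features]
    heads: Dict[str, Callable[[Features], object]]

    def forward(self, image: List[float]) -> Dict[str, object]:
        # The encoder runs once; every head consumes the same features.
        features = self.encoder(image)
        return {name: head(features) for name, head in self.heads.items()}


calls = {"encoder": 0}  # counts encoder invocations for the demo


def toy_encoder(image: List[float]) -> Features:
    calls["encoder"] += 1
    return [2.0 * pixel for pixel in image]


def toy_detection_head(features: Features) -> float:
    # Stand-in for object detection: report the strongest response.
    return max(features)


def toy_drivable_head(features: Features) -> List[bool]:
    # Stand-in for drivable area segmentation: per-element mask.
    return [value > 1.0 for value in features]


def toy_lane_head(features: Features) -> List[bool]:
    # Stand-in for lane detection: a stricter per-element mask.
    return [value > 3.0 for value in features]


net = MultiTaskNet(
    encoder=toy_encoder,
    heads={
        "detection": toy_detection_head,
        "drivable_area": toy_drivable_head,
        "lane": toy_lane_head,
    },
)
outputs = net.forward([0.2, 0.9, 1.8])
```

Sharing the encoder is what lets a network of this shape serve three tasks for roughly the cost of one feature-extraction pass, which matters on an embedded device.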