Funding: Funded by (i) the Natural Science Foundation China (NSFC) under Grant Nos. 61402397, 61263043, 61562093 and 61663046; (ii) the Open Foundation of Key Laboratory in Software Engineering of Yunnan Province, No. 2020SE304; (iii) the Practical Innovation Project of Yunnan University, Project Nos. 2021z34, 2021y128 and 2021y129.
Abstract: Traffic scene captioning technology analyzes an input traffic scene image and automatically generates one or more sentences describing its content, which supports road safety and provides an important decision-making aid for sustainable transportation. To describe complex traffic scenes comprehensively and reasonably, this paper proposes a traffic scene semantic captioning model with multi-stage feature enhancement. The model follows an encoder-decoder structure. First, multilevel granularity visual features are used for feature enhancement during encoding, which enables the model to learn more detailed content from the traffic scene image. Second, a scene knowledge graph is applied during decoding: the semantic features it provides are used to enhance the features learned by the decoder a second time, so that the model learns the attributes of objects in the traffic scene and the relationships between them and generates more reasonable captions. Extensive experiments on the challenging MS-COCO dataset, evaluated with five standard automatic evaluation metrics, show that the proposed model improves significantly on all metrics compared with state-of-the-art methods, notably achieving a score of 129.0 on the CIDEr-D metric. This indicates that the proposed model can provide a more reasonable and comprehensive description of the traffic scene.
Abstract: With the intelligent development of road traffic control and management, higher requirements have been placed on the accuracy and effectiveness of traffic data. How to collect and integrate data for traffic scenes has gained importance in this field as various processing technologies have emerged. A large body of research has been carried out, ranging from theory to engineering application.
Funding: Supported in part by the Shaanxi Natural Science Basic Research Program (2022JM-298), the National Natural Science Foundation of China (52172324), the Shaanxi Provincial Key Research and Development Program (2021SF-483), and the Science and Technology Project of Shaanxi Provincial Transportation Department (21-202K, 20-38T).
Abstract: With the rapid development of intelligent traffic information monitoring technology, accurate identification of vehicles, pedestrians and other objects on the road has become particularly important. To improve the recognition and classification accuracy of image objects in complex traffic scenes, this paper proposes a semantic segmentation refinement method that uses image boundary regions. First, the SegNet semantic segmentation model is used to obtain coarse classification features of the vehicle and road objects. Next, the simple linear iterative clustering (SLIC) algorithm is used to obtain an over-segmentation of the image, which determines the class of each pixel within each superpixel region and thereby improves the segmentation of boundaries and small regions in the vehicle and road image. Finally, the edge recovery ability of a conditional random field (CRF) is used to refine the image boundaries. Experimental results show that, compared with FCN-8s and SegNet, the pixel accuracy of the proposed algorithm improves by 2.33% and 0.57%, respectively. Compared with U-Net, the proposed algorithm performs better on multi-target segmentation.
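The superpixel step described in the abstract above can be illustrated with a small sketch. The abstract does not say how the class of each superpixel is decided, so this example assumes a simple majority vote over the coarse per-pixel labels. The function name `refine_with_superpixels` and the toy inputs are hypothetical; in practice the coarse labels would come from a segmentation network and the superpixel ids from a SLIC implementation.

```python
from collections import Counter, defaultdict


def refine_with_superpixels(coarse_labels, superpixel_ids):
    """Relabel every pixel with the majority coarse class of its superpixel.

    Assumption: majority voting is used here for illustration only; the
    abstract does not specify the actual decision rule.
    """
    # Count how often each coarse class occurs inside each superpixel.
    votes = defaultdict(Counter)
    for label_row, sp_row in zip(coarse_labels, superpixel_ids):
        for label, sp in zip(label_row, sp_row):
            votes[sp][label] += 1

    # Pick the most frequent class for each superpixel.
    winner = {sp: counts.most_common(1)[0][0] for sp, counts in votes.items()}

    # Write the winning class back to every pixel of the superpixel.
    return [[winner[sp] for sp in sp_row] for sp_row in superpixel_ids]


# Toy 3x4 image: class 0 = road, class 1 = vehicle (hypothetical classes).
coarse = [
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 1, 0],
]
# Two superpixels: id 0 covers the left half, id 1 the right half.
superpixels = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
refined = refine_with_superpixels(coarse, superpixels)
```

In this toy case the two isolated misclassified pixels are overwritten by the dominant class of their superpixel, which is the kind of clean-up near boundaries and small regions that the abstract attributes to the over-segmentation step.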
Funding: This work was supported by the National Natural Science Foundation of PRC (No. 60574033) and the National Key Fundamental Research & Development Programs (973) of PRC (No. 2001CB309403).
Abstract: Segmentation of moving objects in a video sequence is a basic task in computer vision applications. However, shadows extracted along with the objects can cause large errors in object localization and recognition. In this paper, we propose a moving shadow detection method based on edge information, which can effectively detect the cast shadow of a moving vehicle in a traffic scene. Once shadows are confirmed to exist in an image, we execute the proposed shadow removal algorithm to segment the shadow from the foreground. The algorithm first removes the boundary of the cast shadow while preserving object edges; second, it reconstructs coarse object shapes from the edge information of the objects; finally, it extracts the cast shadow by subtracting the moving object from the change detection mask and performs further processing. The proposed method has been tested on images taken under different shadow orientations, vehicle colors and vehicle sizes, and the results show that shadows can be successfully eliminated and good video segmentation can thus be obtained.
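The final step of the abstract above, extracting the cast shadow by subtracting the moving object from the change detection mask, amounts to a set difference on binary masks. The following is a minimal sketch under that reading. The function name `extract_cast_shadow` and the toy masks are hypothetical, and the edge-based reconstruction of the object mask and the "further processing" mentioned in the abstract are not modeled.

```python
def extract_cast_shadow(change_mask, object_mask):
    """Return pixels that changed but do not belong to the moving object.

    Both inputs are 2D lists of 0/1 values with the same shape.
    Assumption: the object mask is already available; how it is rebuilt
    from edge information is outside the scope of this sketch.
    """
    return [
        [int(changed and not obj) for changed, obj in zip(c_row, o_row)]
        for c_row, o_row in zip(change_mask, object_mask)
    ]


# Toy 2x4 frame: the change mask covers the vehicle and its shadow.
change = [
    [0, 1, 1, 1],
    [0, 1, 1, 1],
]
# Reconstructed moving-object mask (vehicle only).
vehicle = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
]
shadow = extract_cast_shadow(change, vehicle)
```

What remains in `shadow` is the part of the changed region that is not covered by the vehicle, which is the candidate cast shadow.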
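Abstract: Object detection in autonomous driving scenarios is one of the important research directions in computer vision, and ensuring that autonomous vehicles detect objects accurately and in real time is a research focus. In recent years, deep learning has developed rapidly and has been widely applied in autonomous driving, greatly advancing the field. This paper therefore reviews the state of research on object detection with the YOLO (You Only Look Once) algorithm in autonomous driving from four aspects. First, it summarizes the ideas behind the single-stage YOLO family of detection algorithms and their improvements, and analyzes the strengths and weaknesses of the YOLO family. Second, it discusses the application of YOLO to object detection in autonomous driving scenarios, describing and summarizing the state of research and applications for three tasks: traffic vehicle recognition, pedestrian recognition, and traffic signal recognition. Third, it summarizes commonly used evaluation metrics for object detection, object detection datasets, and autonomous driving scene datasets. Finally, it looks ahead to open problems and future directions in object detection.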
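The survey abstract above refers to commonly used evaluation metrics for object detection without naming them. As an illustration only, the sketch below computes intersection over union (IoU) between two axis-aligned boxes, a quantity that typically underlies detection metrics such as average precision. The function name `iou` and the `(x1, y1, x2, y2)` box format are assumptions made for this example, not details taken from the survey.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2).

    Assumption: x1 < x2 and y1 < y2 for each box (corner format).
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union = sum of both areas minus the overlap counted twice.
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# A predicted box and a ground-truth box that overlap in a 1x1 square.
overlap_score = iou((0, 0, 2, 2), (1, 1, 3, 3))
```

A detection is usually counted as correct when its IoU with a ground-truth box exceeds a chosen threshold; the threshold itself varies between benchmarks.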