Abstract: The rapid development of deep machine-learning techniques has become a key factor in solving the future challenges facing humanity. Vision-based target detection and object classification have improved with the development of deep learning algorithms. Data fusion in autonomous driving is an established prerequisite of data preprocessing from multiple sensors, providing precise, well-engineered, and complete detection of objects, scenes, or events. The goal of the current study is to develop an in-vehicle information system that prevents, or at least mitigates, traffic issues related to parking detection and traffic congestion detection. In this study we address these problems by (1) extracting regions of interest in the images, (2) detecting vehicles based on instance segmentation, and (3) building a deep learning model based on the key features obtained from input parking images. We build a deep machine learning algorithm that collects real video feeds from vision sensors and predicts free parking spaces. Image augmentation was performed using edge detection and cropping, refined by rotation, thresholding, resizing, or color augmentation, to predict bounding-box regions. A deep convolutional neural network, the F-MTCNN model, is proposed that is capable of compiling, training, validating, and testing on parking video frames captured by a video camera. The proposed model was evaluated on the publicly available PK-Lot parking dataset, and the optimized model achieved a higher accuracy (97.6%) than previously reported methodologies. Moreover, this article presents mathematical and simulation results using state-of-the-art deep learning technologies for smart parking space detection. The results are verified using the Python, TensorFlow, and OpenCV frameworks.
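The augmentation steps named in the abstract (resizing, rotation, thresholding, cropping, edge detection) can be illustrated with a minimal sketch. The abstract's pipeline uses OpenCV; the NumPy stand-ins below (downsampling by striding, 90° rotation, gradient-magnitude "edges") are simplified assumptions that keep the example self-contained, not the authors' actual implementation.

```python
import numpy as np

def augment_patch(img, thresh=127):
    """Toy versions of the augmentation steps named in the abstract.
    In practice these would use OpenCV (cv2.resize, cv2.Canny, etc.)."""
    resized = img[::2, ::2]                               # crude 2x downsample (resizing)
    rotated = np.rot90(resized)                           # 90-degree rotation
    binary = (resized > thresh).astype(np.uint8) * 255    # global thresholding
    h, w = resized.shape
    cropped = resized[h // 4: 3 * h // 4, w // 4: 3 * w // 4]  # center crop
    # gradient-magnitude "edges" via finite differences (stand-in for Canny)
    gx = np.abs(np.diff(resized.astype(int), axis=1))
    gy = np.abs(np.diff(resized.astype(int), axis=0))
    edges = ((gx[:-1, :] + gy[:, :-1]) > thresh).astype(np.uint8) * 255
    return resized, rotated, binary, cropped, edges
```

Each transform yields an additional training view of the same parking patch, which is how augmentation enlarges the effective dataset before bounding-box prediction.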
Funding: supported by the National Natural Science Foundation of China (61521002, 61572264, 61620106008), the National Youth Talent Support Program, the Tianjin Natural Science Foundation (17JCJQJC43700, 18ZXZNGX00110), and the Fundamental Research Funds for the Central Universities (Nankai University, No. 63191501).
Abstract: In this paper, we consider salient instance segmentation. As well as producing bounding boxes, our network also outputs high-quality instance-level segments as initial selections to indicate the regions of interest. Taking into account the category-independent property of each target, we design a single-stage salient instance segmentation framework with a novel segmentation branch. Our new branch considers not only the local context inside each detection window but also the surrounding context, enabling us to distinguish instances in the same scope even under partial occlusion. Our network is end-to-end trainable and fast (running at 40 fps for images with resolution 320 × 320). We evaluate our approach on a publicly available benchmark and show that it outperforms alternative solutions. We also provide a thorough analysis of our design choices to help readers better understand the function of each part of our network. Source code can be found at https://github.com/RuochenFan/S4Net.
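The key idea of the segmentation branch, cropping both the detection window and a wider surrounding region, can be sketched as follows. The box format (x0, y0, x1, y1) and the 50% expansion ratio are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def roi_with_context(feature_map, box, expand=0.5):
    """Crop a detection window plus its surrounding context, in the spirit
    of the segmentation branch described above (box layout assumed)."""
    H, W = feature_map.shape[:2]
    x0, y0, x1, y1 = box
    w, h = x1 - x0, y1 - y0
    # enlarge the window by `expand` on every side, clipped to the image
    cx0 = max(0, int(x0 - expand * w))
    cy0 = max(0, int(y0 - expand * h))
    cx1 = min(W, int(x1 + expand * w))
    cy1 = min(H, int(y1 + expand * h))
    inner = feature_map[y0:y1, x0:x1]       # local context inside the window
    context = feature_map[cy0:cy1, cx0:cx1] # surrounding context
    return inner, context
```

Feeding both crops to the mask head lets the network see pixels just outside the box, which is what allows it to separate overlapping instances.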
Funding: supported partially by the National High-Tech Research and Development 863 Program of China under Grant No. 2009AA01Z330 and the National Natural Science Foundation of China under Grant Nos. 61033012 and 60970100.
Abstract: In this paper, we present a video coding scheme which applies visual saliency computation to adjust image fidelity before compression. To extract visually salient features, we construct a spatio-temporal saliency map by analyzing the video with a combined bottom-up and top-down visual saliency model. We then use an extended bilateral filter, in which the local intensity and spatial scales are adjusted according to visual saliency, to adaptively alter the image fidelity. Our implementation is based on the H.264 video encoder JM12.0. Besides evaluating our scheme against the H.264 reference software, we also compare it to a more traditional foreground-background segmentation-based method and a foveation-based approach that employs Gaussian blurring. Our results show that the proposed algorithm improves the compression ratio significantly while effectively preserving perceptual visual quality.
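A saliency-adaptive bilateral filter of the kind described above can be sketched in a few lines: the range scale widens where saliency is low, so non-salient regions are smoothed more aggressively before encoding. The parameter values and the specific modulation rule (`base_sigma_r * (2 - sal)`) are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def saliency_bilateral(img, sal, radius=2, sigma_s=2.0, base_sigma_r=10.0):
    """Bilateral filter whose range kernel widens where saliency is low,
    smoothing non-salient regions more. `sal` is in [0, 1], same shape as img."""
    H, W = img.shape
    out = np.zeros((H, W), dtype=float)
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    spatial = np.exp(-(xs**2 + ys**2) / (2 * sigma_s**2))   # fixed spatial kernel
    pad = np.pad(img.astype(float), radius, mode='edge')
    for i in range(H):
        for j in range(W):
            win = pad[i:i + 2 * radius + 1, j:j + 2 * radius + 1]
            # low saliency -> larger sigma_r -> stronger smoothing
            sigma_r = base_sigma_r * (2.0 - sal[i, j])
            rng = np.exp(-(win - img[i, j])**2 / (2 * sigma_r**2))
            w = spatial * rng
            out[i, j] = (w * win).sum() / w.sum()
    return out
```

Smoothing away low-saliency detail reduces the bits the encoder spends on regions viewers are unlikely to attend to, which is the source of the compression gain reported above.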