Abstract
Existing object detection algorithms struggle to balance accuracy and real-time performance when detecting multiple targets in complex, large-scale scenes. To address this, and inspired by the shapes of the convolution kernels in deep neural networks, we imitate the mechanism of human vision and improve a deep-learning-based object detection framework, the single shot multibox detector (SSD). The resulting multi-target detection framework, adaptive perception SSD, is designed specifically for multi-target detection in complex, large-scale traffic scenes. First, a library of feature convolution kernels composed of multi-form and color Gabor kernels is designed; the optimal group of feature-extraction kernels is selected through training and replaces the low-level convolution kernel group of the original network, which improves detection accuracy. Second, the single-image detection framework is combined with a convolutional long short-term memory (LSTM) network: a bottleneck-LSTM layer refines and propagates feature maps between frames, which associates frame-level information over time, reduces computational cost, and enables tracking and recognition of targets in video that are affected by strong interference. Third, an adaptive threshold strategy is added to reduce the missed-alarm and false-alarm rates. Experimental results show that, compared with other deep-learning-based object detection frameworks, the average precision for each target class increases by 9%~16%, the mean average precision increases by 14%~21%, the multi-target detection rate increases by 21%~36%, and the detection frame rate reaches 32 frame·s⁻¹. The algorithm thus balances accuracy and real-time performance and achieves good detection and recognition results.
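To make the idea of a Gabor convolution kernel library concrete, the following is a minimal sketch that builds a small bank of real-valued Gabor kernels at several orientations and wavelengths using the standard Gabor formula. It is an illustration under stated assumptions, not the authors' implementation: the function names `gabor_kernel` and `gabor_bank`, the kernel size, and all parameter values are hypothetical, and the sketch covers grayscale kernels only (the color Gabor kernels and the training-based selection step described in the abstract are not reproduced).

```python
import numpy as np


def gabor_kernel(size, theta, sigma=2.0, lambd=4.0, gamma=0.5, psi=0.0):
    """Real part of a 2-D Gabor filter on a size x size grid (size should be odd).

    All default parameter values are illustrative assumptions.
    """
    half = size // 2
    # Pixel coordinates centred on the middle of the kernel.
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(np.float64)
    # Rotate the coordinate frame by theta.
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    y_rot = -x * np.sin(theta) + y * np.cos(theta)
    # Gaussian envelope (gamma sets the aspect ratio) times a cosine carrier.
    envelope = np.exp(-(x_rot ** 2 + (gamma * y_rot) ** 2) / (2.0 * sigma ** 2))
    carrier = np.cos(2.0 * np.pi * x_rot / lambd + psi)
    return envelope * carrier


def gabor_bank(size=7, n_orientations=4, wavelengths=(3.0, 5.0)):
    """Stack kernels over evenly spaced orientations and the given wavelengths."""
    kernels = [
        gabor_kernel(size, theta=k * np.pi / n_orientations, lambd=w)
        for w in wavelengths
        for k in range(n_orientations)
    ]
    return np.stack(kernels)


bank = gabor_bank()  # array of shape (n_kernels, size, size)
```

In a detection network, an array like `bank` could serve as fixed or initial weights for a first convolution layer; how the candidate kernels are scored and selected during training is specific to the paper and is not sketched here.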
Authors
华夏
王新晴
王东
马昭烨
邵发明
Hua Xia; Wang Xinqing; Wang Dong; Ma Zhaoye; Shao Faming (College of Field Engineering, PLA Army Engineering University, Nanjing, Jiangsu 210007, China; Second Institute of Engineering Research and Design, Southern Theatre Command, Kunming, Yunnan 650222, China)
Source
《光学学报》
EI
CAS
CSCD
Peking University Core Journals (北大核心)
2018, Issue 12, pp. 213-223 (11 pages)
Acta Optica Sinica
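Funding
National Key Research and Development Program of China (2016YFC0802904)
National Natural Science Foundation of China (61671470)
Natural Science Foundation of Jiangsu Province (BK20161470)
China Postdoctoral Science Foundation, 62nd Batch General Program (2017M623423)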
Keywords
machine vision
biological vision
deep learning
convolutional neural network
Gabor convolution kernel
recurrent neural network