Abstract
To deploy a video vehicle object detection model on low-power embedded devices, a lightweight video vehicle object detection method based on temporal information is proposed. The method builds on the SSD network: MobileNetV3-Small replaces the original VGG-16 backbone for feature extraction, and a convolutional GRU with an attention mechanism is injected directly into the SSD network to fuse temporal information and improve vehicle detection accuracy. A skip connection controlled by a keyframe detection network lets the model update the GRU state only at keyframes, while non-keyframes directly reuse the GRU state of the previous keyframe, which increases detection speed. To further reduce computation, depthwise separable convolutions replace standard convolutional layers throughout the network, and quantization-aware training is used to compress the model while limiting the accuracy loss caused by quantization. Experiments on the UA-DETRAC dataset show that the average detection time per frame is 18 ms on an Intel Core i7 CPU and 134 ms on a Raspberry Pi 4B, with a detection accuracy of 78.81%.
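As an illustration only (not the authors' released code), the following minimal PyTorch sketch shows how a convolutional GRU cell built from depthwise separable convolutions can be gated by a keyframe flag, so that non-keyframes simply reuse the previous keyframe's state as described in the abstract; the module names, channel sizes, and keyframe policy here are assumptions.

```python
# Hypothetical sketch of a keyframe-gated convolutional GRU cell
# built from depthwise separable convolutions (illustrative only).
import torch
import torch.nn as nn


class DepthwiseSeparableConv(nn.Module):
    """3x3 depthwise convolution followed by a 1x1 pointwise convolution."""

    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))


class ConvGRUCell(nn.Module):
    """Convolutional GRU cell whose gates operate on [input, hidden] feature maps."""

    def __init__(self, in_ch: int, hidden_ch: int):
        super().__init__()
        self.update_gate = DepthwiseSeparableConv(in_ch + hidden_ch, hidden_ch)
        self.reset_gate = DepthwiseSeparableConv(in_ch + hidden_ch, hidden_ch)
        self.candidate = DepthwiseSeparableConv(in_ch + hidden_ch, hidden_ch)

    def forward(self, x, h, is_keyframe: bool):
        # Non-keyframe: skip the GRU update and carry over the previous keyframe state.
        if not is_keyframe:
            return h
        z = torch.sigmoid(self.update_gate(torch.cat([x, h], dim=1)))
        r = torch.sigmoid(self.reset_gate(torch.cat([x, h], dim=1)))
        h_tilde = torch.tanh(self.candidate(torch.cat([x, r * h], dim=1)))
        return (1 - z) * h + z * h_tilde


# Minimal usage: fuse per-frame feature maps over time.
cell = ConvGRUCell(in_ch=96, hidden_ch=96)
h = torch.zeros(1, 96, 19, 19)
for t, frame_feat in enumerate(torch.randn(5, 1, 96, 19, 19)):
    h = cell(frame_feat, h, is_keyframe=(t % 3 == 0))  # fixed keyframe interval, illustrative only
```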
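Likewise, the quantization-aware training step mentioned in the abstract can be approximated with PyTorch's eager-mode QAT API; the placeholder model, backend choice, and single training step below are assumptions for illustration, not the paper's pipeline.

```python
# Hypothetical quantization-aware training sketch (eager-mode PyTorch QAT).
import torch
import torch.nn as nn
import torch.quantization as tq

# Placeholder network standing in for the lightweight SSD detector.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 8, 1),
)
model.train()
model.qconfig = tq.get_default_qat_qconfig("fbgemm")  # "qnnpack" would target ARM, e.g. Raspberry Pi
tq.prepare_qat(model, inplace=True)  # insert fake-quantization observers

# One dummy optimization step stands in for the real training loop,
# during which the weights adapt to quantization noise.
opt = torch.optim.SGD(model.parameters(), lr=1e-3)
out = model(torch.randn(2, 3, 64, 64))
loss = out.abs().mean()
loss.backward()
opt.step()

model.eval()
int8_model = tq.convert(model)  # fold observers into int8 modules for deployment
```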
Authors
FU Guang, LIU Yanlong, LIU Jianxia (College of Information and Computer, Taiyuan University of Technology, Jinzhong 030600, China)
Source
Electronic Design Engineering, 2024, Issue 1, pp. 175-180, 186 (7 pages)
Funding
Supported by a Taiyuan University of Technology project (9002-03011843).
Keywords
video object detection
temporal information fusion
adaptive keyframes
quantization-aware training