Abstract
Detecting dynamic small objects in complex field environments remains a challenging problem for defense and military applications, because field surveillance sensing systems suffer from heavy background interference, small targets occupy few pixels, and relevant public datasets are lacking. To address this problem, a YOLOv5-based object detection network with double-frame feature fusion (YOLO-DFNet) is proposed. First, a double-frame feature fusion module (D-F fusion) is introduced to process the adjacent-frame features output by the backbone network; it extracts motion features by computing attention over the channel and temporal dimensions, followed by spatial attention. Second, a temporal trapezoidal fusion network based on an attention mechanism (TTFN_AM) is designed between the neck network and the detection head to attend to dynamic objects within receptive fields of different sizes, thereby improving the detection of small objects with large displacement. Experimental results on the field motion small object dataset (FMSOD) show that the mean average precision (mAP) of YOLO-DFNet over different IoU thresholds is 3.9 percentage points higher than that of YOLOv5, and that YOLO-DFNet also outperforms other object detection networks such as TPH-YOLOv5 and YOLOv7.
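The idea of weighting adjacent-frame features first along the channel and temporal dimensions and then spatially can be illustrated with a small toy example. The sketch below is not the D-F fusion module from the paper: it has no learned parameters, and the function name `double_frame_fusion`, the use of global average pooling, and the use of the inter-frame difference as a motion cue are all assumptions made purely for illustration.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def double_frame_fusion(prev_feat, curr_feat):
    """Toy fusion of two adjacent-frame feature maps of shape (C, H, W).

    Hypothetical, parameter-free stand-in for an attention-based fusion
    step; a real module would learn its attention weights.
    """
    # Stack the two frames along a new time axis: (T=2, C, H, W).
    stacked = np.stack([prev_feat, curr_feat], axis=0)

    # Channel/temporal attention (assumed form): one weight per
    # (frame, channel) pair, derived from global average pooling.
    ct_weights = sigmoid(stacked.mean(axis=(2, 3), keepdims=True))
    reweighted = stacked * ct_weights

    # Collapse the time axis into a single fused map: (C, H, W).
    fused = reweighted.sum(axis=0)

    # Spatial attention (assumed form): locations where the two frames
    # differ most are treated as likely motion and weighted higher.
    motion = np.abs(reweighted[1] - reweighted[0]).mean(axis=0, keepdims=True)
    spatial = sigmoid(motion - motion.mean())

    return fused * spatial


if __name__ == "__main__":
    prev_frame = np.zeros((4, 8, 8))
    curr_frame = np.zeros((4, 8, 8))
    curr_frame[:, 2, 3] = 1.0  # a one-pixel "object" appears in the current frame
    out = double_frame_fusion(prev_frame, curr_frame)
    print(out.shape)
```

In this toy setting the only location with a non-zero response is the pixel that changed between the two frames, which is the behavior a motion-oriented fusion step is meant to encourage.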
Authors
赵筱晗
张泽斌
李宝清
ZHAO Xiaohan; ZHANG Zebin; LI Baoqing (Key Laboratory of Microsystem Technology, Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 201800, China; University of Chinese Academy of Sciences, Beijing 100049, China)
Source
《中国科学院大学学报(中英文)》
CAS
CSCD
PKU Core (Peking University Core Journals)
2024, Issue 6, pp. 810-820 (11 pages)
Journal of University of Chinese Academy of Sciences
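Funding
Supported by the Foundation of the Key Laboratory of Microsystem Technology, Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences (6142804220102).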
Keywords
object detection
field monitoring sensor network
dynamic small object
double-frame feature fusion
spatial-temporal attention