Abstract
A video salient object detection system based on multi-scale feature fusion is designed with ResNet-50 as the backbone network. The system consists mainly of a context semantic aggregation module, which extracts spatial features, and a two-layer convolutional LSTM (ConvLSTM) module, which extracts the temporal correlation between consecutive frames. ResNet-50 extracts the salient object features, and the proposed context semantic aggregation module fuses deep and shallow features through top-down and bottom-up structures, so that the contribution of each feature level can be adjusted flexibly and information is exchanged effectively. The spatially refined features are then fed into the two-layer ConvLSTM to explore the dynamic shift of attention between video frames, which strengthens the representation of motion information across consecutive frames and captures their temporal correlation. Simulation results on three standard datasets show that the system achieves better MAE, F-measure, and S-measure scores than 11 commonly used systems.
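Two of the evaluation metrics named in the abstract, MAE and F-measure, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the fixed binarization threshold of 0.5, and the weight beta_sq = 0.3 (a value commonly used in saliency evaluation) are assumptions made for illustration, and saliency maps are represented as plain 2-D lists of values in [0, 1].

```python
def mae(pred, gt):
    """Mean absolute error between a saliency map and its ground truth.

    Both arguments are 2-D lists of floats in [0, 1] with the same shape.
    """
    total = 0.0
    count = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            total += abs(p - g)
            count += 1
    return total / count


def f_measure(pred, gt, threshold=0.5, beta_sq=0.3):
    """Weighted harmonic mean of precision and recall.

    threshold and beta_sq are illustrative assumptions, not values
    taken from the paper.
    """
    tp = fp = fn = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            predicted = p >= threshold  # binarize the predicted map
            actual = g >= 0.5           # ground truth is treated as binary
            if predicted and actual:
                tp += 1
            elif predicted:
                fp += 1
            elif actual:
                fn += 1
    if tp == 0:
        # No correctly detected salient pixel: define the score as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)
```

A lower MAE and a higher F-measure both indicate a prediction closer to the ground truth. S-measure, which compares structural similarity, is omitted here because it needs a longer region-level and object-level computation.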
Authors
BI Hongbo (毕洪波)
ZHU Huihui (朱徽徽)
YANG Lina (杨丽娜)
ZHANG Cong (张丛)
WU Ranwan (吴然万)
School of Electrical and Information Engineering, Northeast Petroleum University, Daqing 163318, Heilongjiang, China
Source
Research and Exploration in Laboratory (《实验室研究与探索》)
CAS
PKU Core Journals (北大核心)
2022, Issue 3, pp. 94-98 (5 pages)
Keywords
object detection
feature extraction
feature fusion
attention mechanism