Abstract
Violent action recognition is a research direction within video action recognition. With the development of deep learning, video action recognition has made significant progress over the past decade, but it also faces new challenges, such as how to model long-term temporal information in videos effectively and how to reduce computational cost. To address these challenges, a deep learning network model based on a soft attention mechanism and a depthwise separable convolution LSTM, called Att-SepConvLSTM, is introduced. The model first uses the lightweight NasNetMobile network to extract spatial features from video frames, then feeds the spatial feature maps sequentially into the depthwise separable convolution LSTM to obtain global temporal features, and finally passes these through a classification layer that outputs a binary result indicating whether violent behavior is present.
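The two ideas the abstract names, depthwise separable convolution (to cut computational cost) and soft attention (to weight time steps), can be illustrated with a minimal pure-Python sketch. It is not the paper's implementation: the abstract gives no kernel sizes, channel counts, or attention scoring function, so the function names, the example sizes, and the assumption that attention scores arrive as plain numbers are all hypothetical.

```python
import math


def conv_params(k, c_in, c_out):
    """Weight count of a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def separable_conv_params(k, c_in, c_out):
    """Weight count of a depthwise k x k convolution followed by a
    1 x 1 pointwise convolution (bias omitted)."""
    return k * k * c_in + c_in * c_out


def soft_attention_pool(features, scores):
    """Combine per-time-step feature vectors into one vector.

    `scores` holds one unnormalised relevance score per time step
    (how the scores are produced is an assumption left open here).
    They are normalised with a softmax, then used as weights in a
    weighted sum over the time steps.
    """
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(features[0])
    pooled = [
        sum(w * f[i] for w, f in zip(weights, features))
        for i in range(dim)
    ]
    return weights, pooled


if __name__ == "__main__":
    # Hypothetical sizes: 3 x 3 kernel, 64 input and 64 output channels.
    print(conv_params(3, 64, 64), separable_conv_params(3, 64, 64))

    # Three time steps with 2-dimensional features; the last step
    # receives the highest score and therefore the largest weight.
    feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    weights, pooled = soft_attention_pool(feats, [0.0, 0.0, 2.0])
    print(weights, pooled)
```

In a convolutional LSTM each gate applies convolutions to both the input and the hidden state, so under these assumptions the per-convolution saving would apply to every gate.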
Source
Industrial Control Computer (《工业控制计算机》)
2024, No. 8, pp. 107-109 (3 pages)
Keywords
violent action recognition
deep learning
soft attention mechanism
depthwise separable convolution LSTM