Abstract
Human behavior recognition is difficult because behaviors are highly diverse and motion backgrounds are complex. To make full use of the multi-scale information in the spatiotemporal features of video sequences and to improve the recognition rate, an improved two-stream convolutional network is proposed. ResNet serves as the feature extraction network; features from different network levels are fused and then fed into a long short-term memory (LSTM) network. Finally, the predictions of the temporal network and the spatial network are combined by weighted fusion to recognize the behavior. In experiments on the public HMDB51 dataset, the recognition rates of the temporal and spatial networks each improve by more than 2% over the original two-stream network, and the overall recognition accuracy reaches 67.2%, indicating that the method effectively extracts the spatiotemporal information in video sequences and achieves good recognition performance.
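The final weighted-fusion step can be illustrated with a minimal sketch. The abstract does not give the fusion weights or say how scores are normalized, so the softmax normalization, the weights `w_spatial` and `w_temporal`, the function names, and the example scores below are all assumptions chosen for illustration, not the authors' settings.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_predictions(spatial_logits, temporal_logits,
                     w_spatial=1.0, w_temporal=1.5):
    """Weighted late fusion of two streams' class scores.

    The weights are hypothetical; the abstract only states that the
    two predictions are weighted and fused.
    """
    if len(spatial_logits) != len(temporal_logits):
        raise ValueError("both streams must score the same set of classes")
    p_spatial = softmax(spatial_logits)
    p_temporal = softmax(temporal_logits)
    total_w = w_spatial + w_temporal
    # Normalized weighted average, so the fused scores still sum to 1.
    fused = [(w_spatial * a + w_temporal * b) / total_w
             for a, b in zip(p_spatial, p_temporal)]
    predicted = max(range(len(fused)), key=fused.__getitem__)
    return fused, predicted


# Made-up scores for three classes from each stream.
spatial = [2.0, 1.0, 0.1]
temporal = [0.5, 2.5, 0.3]
fused, label = fuse_predictions(spatial, temporal)
print(label, [round(p, 3) for p in fused])
```

Here the spatial stream alone would pick class 0, but the more heavily weighted temporal stream shifts the fused decision to class 1. In practice such weights would be tuned on validation data.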
Authors
Lyu Yalan (吕亚兰); An Jianwei (安建伟)
College of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China
Source
Electronic Measurement Technology (《电子测量技术》)
2020, No. 20, pp. 121-126 (6 pages)
Keywords
feature fusion
two-stream convolutional networks
behavior recognition
long short-term memory