期刊文献+

LSN:Long-Term Spatio-Temporal Network for Video Recognition

原文传递
导出
摘要 Although recurrent neural networks(RNNs)are widely leveraged to process temporal or sequential data,they have attracted too little attention in current video action recognition applications.Therefore,this work attempts to model the long-term spatio-temporal information of the video based on a variant of RNN,i.e.,higher-order RNN.Moreover,we propose a novel long-term spatio-temporal network(LSN)for solving this video task,the core of which integrates the newly constructed high-order ConvLSTM(HO-ConvLSTM)modules with traditional 2D convolutional blocks.Specifically,each HO-ConvLSTM module consists of an accumulated temporary state(ATS)module as well as a standard ConvLSTM module,and several previous hidden states in the ATS module are accumulated to one temporary state that will enter the standard ConvLSTM to determine the output together with the current input.The HO-ConvLSTM module can be inserted into different stages of the 2D convolutional neural network(CNN)in a plug-andplay manner,thus well characterizing the long-term temporal evolution at various spatial resolutions.Experiment results on three commonly used video benchmarks demonstrate that the proposed LSN model can achieve competitive performance with the representative models.
出处 《国际计算机前沿大会会议论文集》 2022年第1期326-338,共13页 International Conference of Pioneering Computer Scientists, Engineers and Educators(ICPCSEE)
基金 supported by the National Natural Science Foundation of China (61972062,61902220) the Young and Middle-aged Talents Program of the National Civil Affairs Commission,and the University-Industry Collaborative Education Program (201902029013).
  • 相关文献

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部