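Abstract: State-of-the-art deep-learning-based reconstruction algorithms for distributed compressed video sensing (DCVS) update non-key frames sequentially using the measurements and the reference frames, and achieve good reconstruction performance. However, lacking rigorous theoretical guidance, they cannot fully combine these two kinds of information, which limits further improvement in non-key-frame reconstruction quality. To address this problem, this paper first uses Bayesian theory and maximum a posteriori (MAP) estimation to derive the optimization equation for non-key-frame reconstruction in DCVS, and then derives a solution framework for that equation based on the proximal gradient algorithm, which includes a multi-information-flow gradient update and aggregation equation. On this basis, the paper designs a Multi-Information flow Gradient update and Aggregation (MIGA) module and builds a Deep Multi-Information flow Gradient update and Aggregation Network (DMIGAN) for DCVS non-key-frame reconstruction. MIGA uses the measurements and multiple reference frames to perform parallel gradient updates of the current non-key frame and then exchanges and fuses the resulting information, so that the multiple information flows are fully combined when updating the reconstructed frame. MIGA is cascaded with a denoising sub-network to emulate a single iteration of the proximal gradient algorithm, forming a basic module (phase); multiple phases are then cascaded to build the deep reconstruction network DMIGAN, which realizes a deep optimization process for frame reconstruction. Experiments show that, compared with the representative traditional iterative optimization algorithm Structural SIMilarity based Inter-Frame Group Sparse Representation (SSIM-Inter F-GSR), DMIGAN improves performance by 8.8 dB and 7.36 dB at low and high sampling rates, respectively; compared with the representative deep-learning reconstruction algorithm VCSNet-2, it improves performance by 7.09 dB and 8.78 dB at low and high sampling rates, respectively.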
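As a rough illustration of the "phase" idea in this abstract (parallel gradient updates from several information flows, aggregation, then a denoising step), the following is a minimal plain-Python sketch. It is not the DMIGAN network: the quadratic reference-frame term, the fixed aggregation weights, the soft-threshold used in place of the learned denoising sub-network, and all function and parameter names (`phase`, `reconstruct`, `rho`, `lam`, `weights`) are assumptions made for illustration only.

```python
def matvec(A, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def transpose(A):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*A)]


def soft_threshold(v, t):
    """Proximal operator of t*||.||_1 (assumed stand-in for a learned denoiser)."""
    return [max(abs(a) - t, 0.0) * (1 if a > 0 else -1) for a in v]


def phase(x, y, Phi, refs, rho=0.5, lam=0.0, weights=None):
    """One unrolled iteration: parallel gradient updates, aggregation, prox step."""
    # Flow 1: gradient step on the measurement term 0.5*||Phi x - y||^2.
    residual = [p - q for p, q in zip(matvec(Phi, x), y)]
    grad = matvec(transpose(Phi), residual)
    flows = [[xi - rho * g for xi, g in zip(x, grad)]]
    # Remaining flows: gradient step on 0.5*||x - r||^2 for each reference frame
    # (an assumed, simplified reference-frame term).
    for r in refs:
        flows.append([xi - rho * (xi - ri) for xi, ri in zip(x, r)])
    # Aggregation: fixed weighted average (a learned fusion would go here).
    if weights is None:
        weights = [1.0 / len(flows)] * len(flows)
    fused = [sum(w * f[i] for w, f in zip(weights, flows)) for i in range(len(x))]
    return soft_threshold(fused, lam)


def reconstruct(y, Phi, refs, n_phases=50, **kwargs):
    """Cascade several phases, starting from an all-zero frame."""
    x = [0.0] * len(Phi[0])
    for _ in range(n_phases):
        x = phase(x, y, Phi, refs, **kwargs)
    return x
```

For example, `reconstruct([2.0], [[1.0, 0.0, 0.0]], [[2.0, 1.0, -1.0]])` measures only the first coordinate and relies on the single reference vector for the other two; with `lam=0.0` the iterates approach the reference values in the unmeasured coordinates.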
Funding: Supported by the MOST, Taiwan, under Grant No. 102-2221-E-468-013.
Abstract: This paper presents a human action recognition method. It analyzes the spatio-temporal grids along dense trajectories and generates histograms of oriented gradients (HOG) and histograms of optical flow (HOF) to describe the appearance and motion of the human subject. The combined HOG and HOF descriptors are then converted to a bag-of-words (BoW) representation by a vocabulary tree. Finally, a random forest is applied to recognize the type of human action. In the experiments, the KTH and URADL databases are used for performance evaluation. Compared with other approaches, the proposed approach performs better on action videos with high intra-class and low inter-class variability.
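The descriptor-to-BoW step in this abstract can be sketched in a few lines of plain Python. This is only an assumed simplification: it uses a flat codebook with nearest-word assignment rather than the vocabulary tree named in the abstract, and the names `combine`, `nearest_word`, and `bow_histogram` are hypothetical. The resulting histogram is the kind of fixed-length vector a classifier such as a random forest would take as input.

```python
def combine(hog, hof):
    """Concatenate a HOG and a HOF descriptor into one feature vector."""
    return list(hog) + list(hof)


def nearest_word(descriptor, codebook):
    """Index of the codebook entry closest to the descriptor (squared L2)."""
    def sqdist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return min(range(len(codebook)), key=lambda k: sqdist(descriptor, codebook[k]))


def bow_histogram(descriptors, codebook):
    """Normalized count of descriptors assigned to each visual word."""
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[nearest_word(d, codebook)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(codebook)
```

A vocabulary tree would replace the linear scan in `nearest_word` with a hierarchical lookup, which matters when the codebook is large; the histogram construction itself stays the same.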