Crowd counting model based on multi-scale multi-column convolutional neural network

Cited by: 9
Abstract: To address the poor performance of people counting in surveillance videos and images caused by scale and perspective variation, a dense crowd counting model based on a Multi-scale Multi-column Convolutional Neural Network (MsMCNN) is proposed. Before features are extracted with MsMCNN, the dataset is processed with a Gaussian filter to obtain the ground-truth density maps of the images, and data augmentation is applied. With the multi-column convolutional neural network structure as its backbone, MsMCNN first extracts feature maps from multiple columns at multiple scales. It then concatenates feature maps of the same resolution within the same column to generate the estimated density map of the image. Finally, the crowd count is obtained by integrating the estimated density map. To verify the effectiveness of the proposed model, experiments were conducted on the Shanghaitech and UCFCC50 datasets. Compared with the classic models Crowdnet, Multi-column Convolutional Neural Network (MCNN), Cascaded Multi-Task Learning (CMTL) and Scale-adaptive Convolutional Neural Network (SaCNN), MsMCNN reduces the Mean Absolute Error (MAE) by at least 10.6 on Part A of the Shanghaitech dataset and by at least 24.5 on the UCFCC50 dataset, and reduces the Mean Squared Error (MSE) by at least 1.8 and 29.3, respectively. MsMCNN also achieves good results on Part B of the Shanghaitech dataset. MsMCNN places more emphasis on combining shallow features and multi-scale features during feature extraction, which effectively reduces the loss of accuracy caused by scale and perspective variation and improves crowd counting performance.
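The preprocessing and counting steps named in the abstract (Gaussian-filtered ground-truth density maps, counting by integrating the density map, and MAE/MSE evaluation) can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function names, the fixed `sigma`, and the border renormalisation are assumptions made for illustration, and the abstract does not say whether the reported MSE includes a square root, so that is left as an optional flag.

```python
import numpy as np


def gaussian_kernel(sigma: float, radius: int) -> np.ndarray:
    """Return a (2*radius+1) x (2*radius+1) Gaussian kernel that sums to 1."""
    ax = np.arange(-radius, radius + 1)
    xx, yy = np.meshgrid(ax, ax)
    kernel = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def density_map(shape, heads, sigma=4.0):
    """Build a density map by placing one unit-mass Gaussian per head.

    `heads` is a list of (row, col) annotations inside the image.
    The fixed sigma is an assumption; the abstract does not give its value.
    """
    h, w = shape
    radius = int(3 * sigma)
    kernel = gaussian_kernel(sigma, radius)
    dmap = np.zeros(shape, dtype=np.float64)
    for r, c in heads:
        # Clip the kernel window to the image bounds.
        r0, r1 = max(r - radius, 0), min(r + radius + 1, h)
        c0, c1 = max(c - radius, 0), min(c + radius + 1, w)
        patch = kernel[r0 - (r - radius):r1 - (r - radius),
                       c0 - (c - radius):c1 - (c - radius)]
        # Renormalise so each person still contributes exactly 1 (assumption).
        dmap[r0:r1, c0:c1] += patch / patch.sum()
    return dmap


def count_from_density(dmap: np.ndarray) -> float:
    """Integrate (sum) a density map to obtain the crowd count."""
    return float(dmap.sum())


def mae(pred, truth) -> float:
    """Mean absolute error between predicted and true counts."""
    pred, truth = np.asarray(pred, float), np.asarray(truth, float)
    return float(np.mean(np.abs(pred - truth)))


def mse(pred, truth, root: bool = False) -> float:
    """Mean squared error; root=True gives the square-root variant."""
    pred, truth = np.asarray(pred, float), np.asarray(truth, float)
    value = float(np.mean((pred - truth) ** 2))
    return float(np.sqrt(value)) if root else value


heads = [(10, 10), (0, 0), (30, 45)]      # hypothetical annotations
dmap = density_map((64, 64), heads)
print(round(count_from_density(dmap), 6))  # sum of the map equals len(heads)
```

Because every head contributes unit mass, summing the map recovers the number of annotated people, which is why a network that regresses the density map can be evaluated on counts alone.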
Authors: LU Jingang; ZHANG Li (School of Computer Science and Technology, Soochow University, Suzhou Jiangsu 215006, China; Jiangsu Provincial Key Laboratory for Computer Information Processing Technology (Soochow University), Suzhou Jiangsu 215006, China)
Source: Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2019, No. 12, pp. 3445-3449 (5 pages)
Funding: Jiangsu Province "Six Talent Peaks" High-Level Talent Project (XYDXX-054)
Keywords: crowd counting; density map; Convolutional Neural Network (CNN); multi-scale; perspective and scale variation
