Crowd counting method based on dual attention mechanism (基于双重注意力机制的人群计数方法)
Abstract To address scale variation, background interference, and partial occlusion in crowd counting in complex scenes, a Dual Attention based Dilated Contextual Convolutional Neural Network (DA-DCCNN) built on dilated convolution is proposed. First, the convolutional layers of VGG16 are used as the feature extractor to obtain abstract, deep-level feature maps of the crowd image. Second, a Dilated Context Module (DCM) is constructed from dilated convolutions to connect the features obtained from different layers, and a Spatial Attention Module (SAM) and a Channel Attention Module (CAM) are introduced to acquire contextual information. Finally, a loss function is formulated by combining the Euclidean distance and cross entropy to measure the difference between the predicted attention map and the ground-truth attention map. Experimental results on three public datasets (ShanghaiTech, UCF_CC_50, and UCF-QNRF) show that DA-DCCNN effectively captures multi-scale image features while enhancing the perception of important regions and channels, and achieves relatively optimal Mean Absolute Error (MAE) results. The feature fusion network based on the dual attention mechanism effectively perceives spatial structure and local features in images, so the generated density maps predict and count crowd regions more accurately.
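The loss and the evaluation metric named in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the exact formulation, so several things here are assumptions: the Euclidean term is applied to density maps, the cross-entropy term is a binary cross entropy on attention maps with values in [0, 1], and the two terms are combined with a hypothetical weight `lam`. All function names are illustrative and are not taken from the paper.

```python
import numpy as np


def euclidean_loss(pred_density, gt_density):
    """Half the squared L2 distance between two density maps (assumed form)."""
    diff = np.asarray(pred_density, dtype=float) - np.asarray(gt_density, dtype=float)
    return 0.5 * float(np.sum(diff ** 2))


def bce_loss(pred_att, gt_att, eps=1e-7):
    """Mean binary cross entropy between attention maps with values in [0, 1]."""
    p = np.clip(np.asarray(pred_att, dtype=float), eps, 1.0 - eps)
    g = np.asarray(gt_att, dtype=float)
    return float(-np.mean(g * np.log(p) + (1.0 - g) * np.log(1.0 - p)))


def combined_loss(pred_density, gt_density, pred_att, gt_att, lam=0.1):
    """Euclidean term plus weighted cross-entropy term.

    `lam` is a hypothetical balancing weight, not a value from the paper.
    """
    return euclidean_loss(pred_density, gt_density) + lam * bce_loss(pred_att, gt_att)


def count_from_density(density):
    """The predicted crowd count is the sum over the density map."""
    return float(np.sum(density))


def mae(pred_counts, gt_counts):
    """Mean Absolute Error between predicted and ground-truth counts."""
    pred = np.asarray(pred_counts, dtype=float)
    gt = np.asarray(gt_counts, dtype=float)
    return float(np.mean(np.abs(pred - gt)))
```

In this sketch, a predicted density map is turned into a count with `count_from_density`, and `mae` is then computed over the per-image counts of a test set. A lower MAE means the predicted counts are closer to the annotated ones.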
Authors ZHAO Zhiqiang (赵志强); MA Peihong (马培红); HEI Xinhong (黑新宏) (School of Computer Science and Engineering, Xi'an University of Technology, Xi'an, Shaanxi 710048, China; Shaanxi Key Laboratory of Network Computing and Security Technology (Xi'an University of Technology), Xi'an, Shaanxi 710048, China)
Source Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2024, No. 9, pp. 2886-2892 (7 pages)
Funding National Natural Science Foundation of China (61976177); Key Research and Development Program of Shaanxi Province (2023-YBGY-222).
Keywords dilated convolution; contextual feature; dual attention mechanism; density map; crowd counting