Abstract
Crowd density estimation has important applications in intelligent security. To address the large variation across viewpoints in two-dimensional images, the loss of spatial information in features, and the difficulty of extracting scale and crowd features, a crowd density estimation method based on multi-feature information fusion is proposed. First, an attention-guided perspective of spatial attention (PSA) method encodes the multi-view information of an image, captures the global spatial context of the feature map, and weakens the influence of viewpoint change. Next, a multi-scale information aggregation (MSIA) method fuses multi-scale asymmetric convolutions with dilated convolutions of different dilation rates to obtain more comprehensive scale and feature information. Finally, detailed semantic feature embedding fusion supplements the high-level feature maps with spatial information and the low-level feature maps with semantic information, so that contextual information and scale information complement each other, improving the accuracy and robustness of the model. Experiments on the ShanghaiTech, Mall, and WorldExpo'10 datasets show that the proposed method achieves some improvement over the methods it is compared with.
Authors
孟月波
陈宣润
刘光辉
徐胜军
Meng Yuebo; Chen Xuanrun; Liu Guanghui; Xu Shengjun (College of Information and Control Engineering, Xi'an University of Architecture and Technology, Xi'an, Shaanxi 710055, China; Guangdong Artificial Intelligence and Digital Economy Laboratory (Guangzhou), Guangzhou, Guangdong 510000, China)
Source
《激光与光电子学进展》
CSCD
Peking University Core Journals
2021, Issue 20, pp. 268-279 (12 pages)
Laser & Optoelectronics Progress
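Funding
National Natural Science Foundation of China, General Program (51678470)
Natural Science Basic Research Program of Shaanxi Province, General Program (2020JM-473, 2020JM-472)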
Keywords
image processing
convolutional neural network
crowd density
global context information
semantic embedding