Abstract
Crowd counting suffers from many interfering factors, such as camera perspective, crowd overlap, and crowd occlusion, which reduce counting accuracy. To address this problem, a deep crowd counting algorithm based on multiscale fusion is proposed. First, part of the VGG-16 network is used to extract low-level crowd features. Second, a multiscale feature extraction module built on dilated convolution extracts multiscale contextual features while reducing the number of model parameters. Finally, low-level detail features and high-level semantic features are fused to improve counting performance and density-map quality. Different algorithms are tested on three public datasets. The experimental results show that, compared with other crowd counting algorithms, the proposed algorithm reduces both the mean absolute error and the mean squared error to varying degrees, indicating good accuracy, robustness, and generalization.
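The abstract credits dilated convolution with capturing multiscale context while keeping the parameter count low. The sketch below illustrates only that general property and is not the authors' module: the 3x3 kernel, the dilation rates 1, 2, and 3, and the helper names `effective_kernel_size` and `dilated_conv2d` are all assumptions chosen for illustration. A dilated kernel keeps the same nine weights but samples the input with gaps, so its spatial extent grows with the dilation rate.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Spatial extent covered by a dilated kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv2d(image, kernel, dilation=1):
    """Stride-1, unpadded 2D dilated cross-correlation on nested lists.

    Hypothetical helper for illustration; `kernel` is assumed square.
    """
    k = len(kernel)
    span = effective_kernel_size(k, dilation)
    out_h = len(image) - span + 1
    out_w = len(image[0]) - span + 1
    if out_h <= 0 or out_w <= 0:
        raise ValueError("image is smaller than the dilated kernel extent")
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0.0
            # Kernel taps are spaced `dilation` pixels apart in the input.
            for a in range(k):
                for b in range(k):
                    acc += kernel[a][b] * image[i + a * dilation][j + b * dilation]
            row.append(acc)
        out.append(row)
    return out


if __name__ == "__main__":
    # Toy 7x7 input whose pixel value is its row-major index.
    image = [[float(r * 7 + c) for c in range(7)] for r in range(7)]
    ones = [[1.0] * 3 for _ in range(3)]  # always 9 weights
    for d in (1, 2, 3):  # assumed dilation rates
        out = dilated_conv2d(image, ones, dilation=d)
        print(f"dilation={d}: extent={effective_kernel_size(3, d)}, "
              f"weights={sum(len(r) for r in ones)}, "
              f"output={len(out)}x{len(out[0])}")
```

Under these assumptions, parallel branches with different dilation rates would each see a different context size at the same parameter cost, which is one common way a multiscale module is assembled.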
Authors
Zuo Jing (左静); Ba Yulin (巴玉林) (School of Automation and Electrical Engineering, Lanzhou Jiaotong University, Lanzhou, Gansu 730070, China)
Source
Laser & Optoelectronics Progress (《激光与光电子学进展》)
CSCD
Peking University Core Journal (北大核心)
2020, Issue 24, pp. 307-315 (9 pages)
Funding
National Natural Science Foundation of China (61763025, 61661027)
Natural Science Foundation of Gansu Province (20JR5RA398)
Keywords
machine vision
crowd counting
density map
convolutional neural network
dilated convolution
feature fusion