Crowd counting is recently becoming a hot research topic, which aims to count the number of the people in different crowded scenes. Existing methods are mainly based on training-testing pattern and rely on large data ...Crowd counting is recently becoming a hot research topic, which aims to count the number of the people in different crowded scenes. Existing methods are mainly based on training-testing pattern and rely on large data training, which fails to accurately count the crowd in real-world scenes because of the limitation of model’s generalization capability. To alleviate this issue, a scene-adaptive crowd counting method based on meta-learning with Dual-illumination Merging Network (DMNet) is proposed in this paper. The proposed method based on learning-to-learn and few-shot learning is able to adapt different scenes which only contain a few labeled images. To generate high quality density map and count the crowd in low-lighting scene, the DMNet is proposed, which contains Multi-scale Feature Extraction module and Element-wise Fusion Module. The Multi-scale Feature Extraction module is used to extract the image feature by multi-scale convolutions, which helps to improve network accuracy. The Element-wise Fusion module fuses the low-lighting feature and illumination-enhanced feature, which supplements the missing illumination in low-lighting environments. Experimental results on benchmarks, WorldExpo’10, DISCO, USCD, and Mall, show that the proposed method outperforms the existing state-of-the-art methods in accuracy and gets satisfied results.展开更多
Existing unsupervised person re-identification approaches fail to fully capture thefine-grained features of local regions,which can result in people with similar appearances and different identities being assigned the...Existing unsupervised person re-identification approaches fail to fully capture thefine-grained features of local regions,which can result in people with similar appearances and different identities being assigned the same label after clustering.The identity-independent information contained in different local regions leads to different levels of local noise.To address these challenges,joint training with local soft attention and dual cross-neighbor label smoothing(DCLS)is proposed in this study.First,the joint training is divided into global and local parts,whereby a soft attention mechanism is proposed for the local branch to accurately capture the subtle differences in local regions,which improves the ability of the re-identification model in identifying a person’s local significant features.Second,DCLS is designed to progressively mitigate label noise in different local regions.The DCLS uses global and local similarity metrics to semantically align the global and local regions of the person and further determines the proximity association between local regions through the cross information of neighboring regions,thereby achieving label smoothing of the global and local regions throughout the training process.In extensive experiments,the proposed method outperformed existing methods under unsupervised settings on several standard person re-identification datasets.展开更多
基金supported by the National Natural Science Foundation of China(Grant Nos.62076117 and 61762061)the Natural Science Foundation of Jiangxi Province,China(20161ACB20004)Jiangxi Key Laboratory of Smart City(20192BCD40002).
文摘Crowd counting is recently becoming a hot research topic, which aims to count the number of the people in different crowded scenes. Existing methods are mainly based on training-testing pattern and rely on large data training, which fails to accurately count the crowd in real-world scenes because of the limitation of model’s generalization capability. To alleviate this issue, a scene-adaptive crowd counting method based on meta-learning with Dual-illumination Merging Network (DMNet) is proposed in this paper. The proposed method based on learning-to-learn and few-shot learning is able to adapt different scenes which only contain a few labeled images. To generate high quality density map and count the crowd in low-lighting scene, the DMNet is proposed, which contains Multi-scale Feature Extraction module and Element-wise Fusion Module. The Multi-scale Feature Extraction module is used to extract the image feature by multi-scale convolutions, which helps to improve network accuracy. The Element-wise Fusion module fuses the low-lighting feature and illumination-enhanced feature, which supplements the missing illumination in low-lighting environments. Experimental results on benchmarks, WorldExpo’10, DISCO, USCD, and Mall, show that the proposed method outperforms the existing state-of-the-art methods in accuracy and gets satisfied results.
基金supported by the National Natural Science Foundation of China under Grant Nos.62076117 and 62166026the Jiangxi Key Laboratory of Smart City under Grant No.20192BCD40002Jiangxi Provincial Natural Science Foundation under Grant No.20224BAB212011.
文摘Existing unsupervised person re-identification approaches fail to fully capture thefine-grained features of local regions,which can result in people with similar appearances and different identities being assigned the same label after clustering.The identity-independent information contained in different local regions leads to different levels of local noise.To address these challenges,joint training with local soft attention and dual cross-neighbor label smoothing(DCLS)is proposed in this study.First,the joint training is divided into global and local parts,whereby a soft attention mechanism is proposed for the local branch to accurately capture the subtle differences in local regions,which improves the ability of the re-identification model in identifying a person’s local significant features.Second,DCLS is designed to progressively mitigate label noise in different local regions.The DCLS uses global and local similarity metrics to semantically align the global and local regions of the person and further determines the proximity association between local regions through the cross information of neighboring regions,thereby achieving label smoothing of the global and local regions throughout the training process.In extensive experiments,the proposed method outperformed existing methods under unsupervised settings on several standard person re-identification datasets.