This paper presents a large gathering dataset of images extracted from publicly filmed videos by 24 cameras installed on the premises of Masjid Al-Nabvi,Madinah,Saudi Arabia.This dataset consists of raw and processed ...This paper presents a large gathering dataset of images extracted from publicly filmed videos by 24 cameras installed on the premises of Masjid Al-Nabvi,Madinah,Saudi Arabia.This dataset consists of raw and processed images reflecting a highly challenging and unconstraint environment.The methodology for building the dataset consists of four core phases;that include acquisition of videos,extraction of frames,localization of face regions,and cropping and resizing of detected face regions.The raw images in the dataset consist of a total of 4613 frames obtained fromvideo sequences.The processed images in the dataset consist of the face regions of 250 persons extracted from raw data images to ensure the authenticity of the presented data.The dataset further consists of 8 images corresponding to each of the 250 subjects(persons)for a total of 2000 images.It portrays a highly unconstrained and challenging environment with human faces of varying sizes and pixel quality(resolution).Since the face regions in video sequences are severely degraded due to various unavoidable factors,it can be used as a benchmark to test and evaluate face detection and recognition algorithms for research purposes.We have also gathered and displayed records of the presence of subjects who appear in presented frames;in a temporal context.This can also be used as a temporal benchmark for tracking,finding persons,activity monitoring,and crowd counting in large crowd scenarios.展开更多
基金This research was supported by the Deanship of Scientific Research,Islamic University of Madinah,Madinah(KSA),under Tammayuz program Grant Number 1442/505.
文摘This paper presents a large gathering dataset of images extracted from publicly filmed videos by 24 cameras installed on the premises of Masjid Al-Nabvi,Madinah,Saudi Arabia.This dataset consists of raw and processed images reflecting a highly challenging and unconstraint environment.The methodology for building the dataset consists of four core phases;that include acquisition of videos,extraction of frames,localization of face regions,and cropping and resizing of detected face regions.The raw images in the dataset consist of a total of 4613 frames obtained fromvideo sequences.The processed images in the dataset consist of the face regions of 250 persons extracted from raw data images to ensure the authenticity of the presented data.The dataset further consists of 8 images corresponding to each of the 250 subjects(persons)for a total of 2000 images.It portrays a highly unconstrained and challenging environment with human faces of varying sizes and pixel quality(resolution).Since the face regions in video sequences are severely degraded due to various unavoidable factors,it can be used as a benchmark to test and evaluate face detection and recognition algorithms for research purposes.We have also gathered and displayed records of the presence of subjects who appear in presented frames;in a temporal context.This can also be used as a temporal benchmark for tracking,finding persons,activity monitoring,and crowd counting in large crowd scenarios.