Human hand detection in uncontrolled environments is a challenging visual recognition task due to the numerous variations of hand poses and background image clutter. To achieve highly accurate results as well as real-time execution, we propose a deep transfer learning approach built over a state-of-the-art deep learning object detector. Our method, denoted YOLOHANDS, is built on top of the You Only Look Once (YOLO) deep learning architecture, which is modified to adapt to the single-class hand detection task. The model transfer is performed by modifying the higher convolutional layers, including the last fully connected layer, while initializing the lower, unmodified layers with generic pre-trained weights. To address robustness issues, we introduce a comprehensive augmentation procedure over the training image dataset, specifically adapted to the hand detection problem. Experimental evaluation of the proposed method on a challenging public dataset demonstrates highly accurate results, comparable to state-of-the-art methods.
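Detection accuracy in evaluations like this is typically measured by the intersection-over-union (IoU) between predicted and ground-truth bounding boxes. A minimal sketch of that metric in plain Python (the corner-based box format and the 0.5 acceptance threshold are common conventions assumed here, not details taken from the paper):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# A predicted hand box is usually counted as a correct detection when IoU >= 0.5.
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))  # -> 0.3333333333333333
```

With the half-overlapping boxes above, the intersection is 50 against a union of 150, so the prediction would be rejected under a 0.5 threshold.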
A method is presented to convert any display screen into a touchscreen using a pair of cameras. Most state-of-the-art touchscreens rely on special touch-sensitive hardware or on infrared sensors in various configurations. We describe a novel computer-vision-based method that robustly identifies fingertips and detects touch with a precision of a few millimeters above the screen. In our system, the two cameras capture the display screen simultaneously, and users interact with the computer through a fingertip on the display screen. We make two important contributions: first, we develop a simple and robust hand detection method based on predicted images; second, we determine whether a physical touch takes place from the homography between the two cameras. Because the appearance of the display screen in the camera images is inherently predictable from the computer's output images, we can compute predicted images and extract the human hand precisely by simply subtracting the predicted images from the captured images.
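The predicted-image idea can be sketched as a per-pixel subtraction followed by thresholding: pixels where the captured frame deviates strongly from the predicted screen appearance are attributed to the hand. In this illustrative sketch, nested lists of grayscale values stand in for camera frames, and the threshold value is an assumption, not a figure from the paper:

```python
def hand_mask(captured, predicted, threshold=30):
    """Flag pixels where the captured frame deviates from the predicted
    screen appearance by more than the threshold (likely hand pixels)."""
    return [[abs(c - p) > threshold for c, p in zip(crow, prow)]
            for crow, prow in zip(captured, predicted)]

predicted = [[100, 100, 100], [100, 100, 100]]   # expected screen appearance
captured  = [[100, 180, 100], [100, 175, 102]]   # bright hand pixels in column 1
print(hand_mask(captured, predicted))  # -> [[False, True, False], [False, True, False]]
```

In the real system, fingertip localization and the two-camera homography test would then operate on this mask rather than on raw frames.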
Teaching and learning activities have embraced digital technology, especially under the global pandemic restrictions of the last two years. Most academic and professional presentations are now conducted on online platforms. However, presenter-audience interaction is hindered to some extent online, in contrast to face-to-face settings where real-time writing is beneficial whenever sketching is involved. Digital pens and pads are one solution for such cases, but the cost of acquiring this hardware is high, so economical alternatives are essential where affordability is a concern. In this study, a real-time, user-friendly, innovative drawing system is developed to address these problems with online presentations. This paper presents an algorithm based on hand landmark detection that replaces the chalk, markers, and ballpoint pens used in conventional communication and presentation with an online equivalent. The proposed application is implemented with Python and the OpenCV library. Letters or sketches drawn in the air are transferred directly to the computer screen: the algorithm continuously identifies hand landmarks in the images fed by the web camera, and text or drawing patterns are displayed on the screen according to the landmarks' movements in image space. The developed user interface is also user-friendly, enabling the communication of letters and sketches. Although the concept has been developed and tested, further research can enhance the versatility and accuracy of communication.
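Hand-landmark detectors commonly report landmark positions as coordinates normalized to [0, 1], so an air-drawing system must map them into screen pixels and smooth out frame-to-frame jitter before drawing a stroke. A minimal sketch of those two steps (the smoothing factor and screen resolution are illustrative assumptions, not values from the study):

```python
def to_screen(norm_x, norm_y, width, height):
    """Map a normalized landmark coordinate (0..1) to pixel coordinates."""
    return int(norm_x * width), int(norm_y * height)

def smooth(prev, new, alpha=0.5):
    """Exponential smoothing to steady the drawn stroke against landmark jitter."""
    return tuple(alpha * n + (1 - alpha) * p for p, n in zip(prev, new))

print(to_screen(0.5, 0.25, 1280, 720))  # -> (640, 180)
print(smooth((100, 100), (110, 120)))   # -> (105.0, 110.0)
```

Each smoothed point would then be connected to the previous one on a drawing canvas, producing a continuous stroke as the fingertip moves.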
Background: Interactions with virtual 3D objects in a virtual reality (VR) environment, using finger gestures captured by a wearable 2D camera, have emerging real-life applications. Method: This paper presents a two-stage convolutional neural network approach, one stage for the detection of the hand and another for the fingertips. One purpose of VR environments is to transform a virtual 3D object with affine parameters derived from the gesture of the thumb and index fingers. Results: To evaluate the performance of the proposed system, one existing and one newly developed egocentric fingertip database are employed, so that learning involves the large variations common in real life. Experimental results show that the proposed fingertip detection system outperforms existing systems in terms of detection precision. Conclusion: The interaction performance of the proposed system in the VR environment is higher than that of existing systems in terms of estimation error and the correlation between the ground-truth and estimated affine parameters.
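One natural way to derive an affine parameter from a thumb-index gesture is to use the distance between the two detected fingertips relative to a calibrated baseline as a uniform scale factor. This is a hypothetical sketch of that mapping, not the paper's actual parameter estimation:

```python
import math

def pinch_scale(thumb, index, baseline):
    """Derive a uniform scale factor for the virtual object from the
    thumb-index fingertip distance relative to a calibration baseline."""
    return math.dist(thumb, index) / baseline

# Fingertips twice as far apart as at calibration -> object scaled by 2.
print(pinch_scale((0, 0), (6, 8), baseline=5.0))  # -> 2.0
```

Translation and rotation parameters could be derived analogously from the midpoint and orientation of the thumb-index segment across frames.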
In this paper, a real-time system that uses hand gestures to interactively control a presentation is proposed. The system employs a thermal camera for robust human body segmentation, handling the complex background and varying illumination posed by the projector. A fast and robust hand localization algorithm is proposed, with which the head, torso, and arm are sequentially localized. Hand trajectories are segmented and recognized as gestures for interaction. A dual-step calibration algorithm maps the interaction regions between the thermal camera and the projected contents by integrating a web camera. Experiments show that the system achieves a high recognition rate for hand gestures, and the corresponding interactions are performed correctly.
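A dual-step calibration of this kind can be viewed as composing two planar homographies: thermal camera to web camera, then web camera to projected content. A minimal sketch with 3x3 matrices in plain Python (the matrices below are illustrative translation and scale transforms, not calibration results from the paper):

```python
def matmul3(a, b):
    """3x3 matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply_h(h, x, y):
    """Apply a homography to a point and dehomogenize."""
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return u / w, v / w

H_thermal_to_web = [[1, 0, 5], [0, 1, 3], [0, 0, 1]]  # illustrative shift
H_web_to_screen  = [[2, 0, 0], [0, 2, 0], [0, 0, 1]]  # illustrative scale
H_total = matmul3(H_web_to_screen, H_thermal_to_web)  # single thermal->screen map
print(apply_h(H_total, 10, 10))  # -> (30.0, 26.0)
```

Composing the two steps once, offline, lets the runtime system map each detected hand position from the thermal image directly into projected-content coordinates with a single matrix multiply.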
Funding: This work was financed by the Ministry of Education, Science and Technological Development of the Republic of Serbia.