skip to main content

Title: A Comparative Study of Hand Gesture Recognition Devices in the Context of Game Design
Gesture recognition devices provide a new means for natural human-computer interaction. However, when selecting these devices for games, designers might find it challenging to decide which gesture recognition device will work best. In the present research, we compare three vision-based, hand gesture devices: Leap Motion, Microsoft's Kinect, and Intel's RealSense. We developed a simple hand-gesture based game to evaluate performance, cognitive demand, comfort, and player experience of using these gesture devices. We found that participants' preferred and performed much better using Leap Motion and Kinect compared to using RealSense. Leap Motion also outperformed or was equivalent to Kinect. These findings suggest that not all gesture recognition devices can be suitable for games and that designers need to make better decisions when selecting gesture recognition devices and designing gesture based games to insure the usability, accuracy, and comfort of such games.
Authors:
; ; ;
Award ID(s):
1651532 1619273
Publication Date:
NSF-PAR ID:
10174280
Journal Name:
Proceedings of the 2019 ACM International Conference on Interactive Surfaces and Spaces
Page Range or eLocation-ID:
397 - 402
Sponsoring Org:
National Science Foundation
More Like this
  1. Gesture recognition devices provide a new means for natural human-computer interaction. However, when selecting these devices to be used in games, designers might find it challenging to decide which gesture recognition device will work best. In the present research, we compare three vision-based, hand-gesture devices: Leap Motion, Microsoft’s Kinect, and Intel’s RealSense. The comparison provides game designers with an understanding of the main factors to consider when selecting these devices and how to design games that use them. We developed a simple hand-gesture-based game to evaluate performance, cognitive demand, comfort, and player experience of using these gesture devices. We found that participants preferred and performed much better using Leap Motion and Kinect compared to using RealSense. Leap Motion also outperformed or was equivalent to Kinect. These findings were supported by players’ accounts of their experiences using these gesture devices. Based on these findings, we discuss how such devices can be used by game designers and provide them with a set of design cautions that provide insights into the design of gesture-based games.
  2. Abstract
    The PoseASL dataset consists of color and depth videos collected from ASL signers at the Linguistic and Assistive Technologies Laboratory under the direction of Matt Huenerfauth, as part of a collaborative research project with researchers at the Rochester Institute of Technology, Boston University, and the University of Pennsylvania. Access: After becoming an authorized user of Databrary, please contact Matt Huenerfauth if you have difficulty accessing this volume. We have collected a new dataset consisting of color and depth videos of fluent American Sign Language signers performing sequences ASL signs and sentences. Given interest among sign-recognition and other computer-vision researchers in red-green-blue-depth (RBGD) video, we release this dataset for use by the research community. In addition to the video files, we share depth data files from a Kinect v2 sensor, as well as additional motion-tracking files produced through post-processing of this data. Organization of the Dataset: The dataset is organized into sub-folders, with codenames such as "P01" or "P16" etc. These codenames refer to specific human signers who were recorded in this dataset. Please note that there was no participant P11 nor P14; those numbers were accidentally skipped during the process of making appointments to collect video stimuli. Task: DuringMore>>
  3. Raynal, Ann M. ; Ranney, Kenneth I. (Ed.)
    Most research in technologies for the Deaf community have focused on translation using either video or wearable devices. Sensor-augmented gloves have been reported to yield higher gesture recognition rates than camera-based systems; however, they cannot capture information expressed through head and body movement. Gloves are also intrusive and inhibit users in their pursuit of normal daily life, while cameras can raise concerns over privacy and are ineffective in the dark. In contrast, RF sensors are non-contact, non-invasive and do not reveal private information even if hacked. Although RF sensors are unable to measure facial expressions or hand shapes, which would be required for complete translation, this paper aims to exploit near real-time ASL recognition using RF sensors for the design of smart Deaf spaces. In this way, we hope to enable the Deaf community to benefit from advances in technologies that could generate tangible improvements in their quality of life. More specifically, this paper investigates near real-time implementation of machine learning and deep learning architectures for the purpose of sequential ASL signing recognition. We utilize a 60 GHz RF sensor which transmits a frequency modulation continuous wave (FMWC waveform). RF sensors can acquire a unique source of information that ismore »inaccessible to optical or wearable devices: namely, a visual representation of the kinematic patterns of motion via the micro-Doppler signature. Micro-Doppler refers to frequency modulations that appear about the central Doppler shift, which are caused by rotational or vibrational motions that deviate from principle translational motion. In prior work, we showed that fractal complexity computed from RF data could be used to discriminate signing from daily activities and that RF data could reveal linguistic properties, such as coarticulation. We have also shown that machine learning can be used to discriminate with 99% accuracy the signing of native Deaf ASL users from that of copysigning (or imitation signing) by hearing individuals. Therefore, imitation signing data is not effective for directly training deep models. But, adversarial learning can be used to transform imitation signing to resemble native signing, or, alternatively, physics-aware generative models can be used to synthesize ASL micro-Doppler signatures for training deep neural networks. With such approaches, we have achieved over 90% recognition accuracy of 20 ASL signs. In natural environments, however, near real-time implementations of classification algorithms are required, as well as an ability to process data streams in a continuous and sequential fashion. In this work, we focus on extensions of our prior work towards this aim, and compare the efficacy of various approaches for embedding deep neural networks (DNNs) on platforms such as a Raspberry Pi or Jetson board. We examine methods for optimizing the size and computational complexity of DNNs for embedded micro-Doppler analysis, methods for network compression, and their resulting sequential ASL recognition performance.« less
  4. The goal of this research is to provide much needed empirical data on how the fidelity of popular hand gesture tracked based pointing metaphors versus commodity controller based input affects the efficiency and speed-accuracy tradeoff in users’ spatial selection in personal space interactions in VR. We conduct two experiments in which participants select spherical targets arranged in a circle in personal space, or near-field within their maximum arms reach distance, in VR. Both experiments required participants to select the targets with either a VR controller or with their dominant hand’s index finger, which was tracked with one of two popular contemporary tracking methods. In the first experiment, the targets are arranged in a flat circle in accordance with the ISO 9241-9 Fitts’ law standard, and the simulation selected random combinations of 3 target amplitudes and 3 target widths. Targets were placed centered around the users’ eye level, and the arrangement was placed at either 60%, 75%, or 90% depth plane of the users’ maximum arm’s reach. In experiment 2, the targets varied in depth randomly from one depth plane to another within the same configuration of 13 targets within a trial set, which resembled button selection task in hierarchical menusmore »in differing depth planes in the near-field. The study was conducted using the HTC Vive head-mounted display, and used either a VR controller (HTC Vive), low-fidelity virtual pointing (Leap Motion), or a high-fidelity virtual pointing (tracked VR glove) conditions. Our results revealed that low-fidelity pointing performed worse than both high-fidelity pointing and the VR controller. Overall, target selection performance was found to be worse in depth planes closer to the maximum arms reach, as compared to middle and nearer distances.« less
  5. User authentication plays an important role in securing systems and devices by preventing unauthorized accesses. Although surface Electromyogram (sEMG) has been widely applied for human machine interface (HMI) applications, it has only seen a very limited use for user authentication. In this paper, we investigate the use of multi-channel sEMG signals of hand gestures for user authentication. We propose a new deep anomaly detection-based user authentication method which employs sEMG images generated from multi-channel sEMG signals. The deep anomaly detection model classifies the user performing the hand gesture as client or imposter by using sEMG images as the input. Different sEMG image generation methods are studied in this paper. The performance of the proposed method is evaluated with a high-density hand gesture sEMG (HD-sEMG) dataset and a sparse-density hand gesture sEMG (SD-sEMG) dataset under three authentication test scenarios. Among the sEMG image generation methods, root mean square (RMS) map achieves significantly better performance than others. The proposed method with RMS map also greatly outperforms the reference method, especially when using SD-sEMG signals. The results demonstrate the validity of the proposed method with RMS map for user authentication.