In this paper, we propose a machine learning-based multi-stream framework to recognize American Sign Language (ASL) manual signs and nonmanual gestures (face and head movements) in real time from RGB-D videos. Our approach is based on 3D Convolutional Neural Networks (3D CNNs) and fuses multi-modal features, including hand gestures, facial expressions, and body poses, from multiple channels (RGB, Depth, Motion, and Skeleton joints). To learn the overall temporal dynamics in a video, a proxy video is generated by selecting a subset of frames from each video, and these proxy videos are then used to train the proposed 3D CNN model. We collected a new ASL dataset, ASL-100-RGBD, which contains 42 RGB-D videos captured by a Microsoft Kinect V2 camera. Each video consists of 100 ASL manual signs, along with RGB channel, Depth maps, Skeleton joints, Face features, and HD face. The dataset is fully annotated for each semantic region (i.e., the time duration of each sign that the human signer performs). Our proposed method achieves 92.88% accuracy for recognizing 100 ASL sign glosses in our newly collected ASL-100-RGBD dataset. The effectiveness of our framework for recognizing hand gestures from RGB-D videos is further demonstrated on a large-scale dataset, ChaLearn IsoGD, where it achieves state-of-the-art results.
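The proxy-video step mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not state how the subset of frames is chosen, so uniform temporal sampling is assumed here purely for illustration; the function name `select_proxy_frames` and its parameters are hypothetical and not taken from the paper.

```python
def select_proxy_frames(num_frames: int, proxy_length: int) -> list[int]:
    """Pick frame indices spread evenly over a video (assumed strategy).

    num_frames   -- number of frames in the source video
    proxy_length -- number of frames wanted in the proxy video
    """
    if num_frames <= 0 or proxy_length <= 0:
        raise ValueError("num_frames and proxy_length must be positive")
    # A video shorter than the proxy length keeps all of its frames.
    if proxy_length >= num_frames:
        return list(range(num_frames))
    # Split the video into proxy_length equal segments and take the
    # frame nearest the middle of each segment.
    step = num_frames / proxy_length
    return [int(i * step + step / 2) for i in range(proxy_length)]


if __name__ == "__main__":
    print(select_proxy_frames(10, 5))
```

Sampling from the middle of equal segments keeps the selected frames in temporal order and covers the whole sign, which is the property a proxy video needs regardless of the exact selection rule used.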
                            A Wireframe-Based Approach for Classifying and Acquiring Proficiency in the American Sign Language (Student Abstract)
                        
                    
    
            We describe our methodology for classifying ASL (American Sign Language) gestures. Rather than operate directly on raw images of hand gestures, we extract coordinates and render wireframes from individual images to construct a curated training dataset. This dataset is then used in a classifier that is memory efficient and provides effective performance (94% accuracy). Because we construct wireframes that contain information about several angles in the joints that comprise hands, our methodology is amenable to training those interested in learning ASL by identifying targeted errors in their hand gestures.
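As an illustration of the joint-angle information a wireframe can carry, the sketch below computes the angle at one joint from three keypoint coordinates. It is a generic geometric calculation under stated assumptions, not the authors' implementation; the function name `joint_angle` and the example coordinates are hypothetical.

```python
import math


def joint_angle(a, b, c):
    """Return the angle in degrees at point b between segments b->a and b->c.

    a, b, c are coordinate tuples of equal length (2D or 3D), for example
    three consecutive keypoints along one finger (assumed input format).
    """
    v1 = [p - q for p, q in zip(a, b)]
    v2 = [p - q for p, q in zip(c, b)]
    dot = sum(x * y for x, y in zip(v1, v2))
    n1 = math.sqrt(sum(x * x for x in v1))
    n2 = math.sqrt(sum(x * x for x in v2))
    if n1 == 0 or n2 == 0:
        raise ValueError("keypoints must be distinct from the joint")
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_theta = max(-1.0, min(1.0, dot / (n1 * n2)))
    return math.degrees(math.acos(cos_theta))


if __name__ == "__main__":
    # Hypothetical keypoints: a right-angle bend at the middle point.
    print(joint_angle((1.0, 0.0), (0.0, 0.0), (0.0, 1.0)))
```

Comparing such angles between a learner's gesture and a reference gesture is one way per-joint differences could be surfaced as targeted feedback.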
        
    
- Award ID(s):
- 2303019
- PAR ID:
- 10537709
- Publisher / Repository:
- AAAI Press
- Date Published:
- Journal Name:
- Proceedings of the AAAI Conference on Artificial Intelligence
- Volume:
- 38
- Issue:
- 21
- ISSN:
- 2159-5399
- Page Range / eLocation ID:
- 23606 to 23607
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
- 
            User authentication is an important security mechanism to prevent unauthorized access to systems or devices. In this paper, we propose a new user authentication method based on surface electromyogram (sEMG) images of hand gestures and deep anomaly detection. Multi-channel sEMG signals acquired while the user performs a hand gesture are converted into sEMG images, which are used as the input of a deep anomaly detection model to classify the user as client or imposter. The performance of different sEMG image generation methods in three authentication test scenarios is investigated using a public hand gesture sEMG dataset. Our experimental results demonstrate the viability of the proposed method for user authentication.
- 
            IEEE (Ed.) Over the past few years, unmanned aircraft vehicles (UAVs) have become increasingly popular for various purposes such as surveillance, automated industry, robotics, vehicle guidance, and traffic monitoring and control systems. It is important to have multiple methods of controlling UAVs to suit their different uses. The goal of this work was to develop a new technique to control a UAV by using different hand gestures. To achieve this, a hand keypoint detection algorithm was used to detect 21 keypoints in the hand. These keypoints were then used as the input to an intelligent system based on Convolutional Neural Networks (CNN) that was able to classify the hand gestures. To capture the hand gestures, the video camera of the UAV was used. A database containing 2400 hand images was created and used to train the CNN. The database contained 8 different hand gestures that were selected to send specific motion commands to the UAV. The accuracy of the CNN in classifying the hand gestures was 93%. To test the capabilities of our intelligent control system, a small UAV, the DJI Ryze Tello drone, was used. The experimental results demonstrated that the DJI Tello drone could be successfully controlled by hand gestures in real time.
- 
            Deaf spaces are unique indoor environments designed to optimize visual communication and Deaf cultural expression. However, much of the technological research geared towards the deaf involves the use of video or wearables for American Sign Language (ASL) translation, with little consideration for Deaf perspectives on privacy and usability of the technology. In contrast to video, RF sensors offer an avenue for ambient ASL recognition while also preserving privacy for Deaf signers. Methods: This paper investigates the RF transmit waveform parameters required for effective measurement of ASL signs and their effect on word-level classification accuracy attained with transfer learning and convolutional autoencoders (CAE). A multi-frequency fusion network is proposed to exploit data from all sensors in an RF sensor network and improve the recognition accuracy of fluent ASL signing. Results: For fluent signers, CAEs yield a 20-sign classification accuracy of 76% at 77 GHz and 73% at 24 GHz, while at X-band (10 GHz) accuracy drops to 67%. For hearing imitation signers, signs are more separable, resulting in a 96% accuracy with CAEs. Further, fluent ASL recognition accuracy is significantly increased with use of the multi-frequency fusion network, which boosts the 20-sign fluent ASL recognition accuracy to 95%, surpassing conventional feature-level fusion by 12%. Implications: Signing involves finer spatiotemporal dynamics than typical hand gestures, and thus requires interrogation with a transmit waveform that has a rapid succession of pulses and high bandwidth. Millimeter wave RF frequencies also yield greater accuracy due to the increased Doppler spread of the radar backscatter. Comparative analysis of articulation dynamics also shows that imitation signing is not representative of fluent signing, and not effective in pre-training networks for fluent ASL classification.
Deep neural networks employing multi-frequency fusion capture both shared and sensor-specific features and thus offer significant performance gains in comparison to using a single sensor or feature-level fusion.
- 
            User authentication plays an important role in securing systems and devices by preventing unauthorized access. Although surface electromyogram (sEMG) has been widely applied in human machine interface (HMI) applications, it has seen only very limited use for user authentication. In this paper, we investigate the use of multi-channel sEMG signals of hand gestures for user authentication. We propose a new deep anomaly detection-based user authentication method which employs sEMG images generated from multi-channel sEMG signals. The deep anomaly detection model classifies the user performing the hand gesture as client or imposter by using sEMG images as the input. Different sEMG image generation methods are studied in this paper. The performance of the proposed method is evaluated with a high-density hand gesture sEMG (HD-sEMG) dataset and a sparse-density hand gesture sEMG (SD-sEMG) dataset under three authentication test scenarios. Among the sEMG image generation methods, the root mean square (RMS) map achieves significantly better performance than the others. The proposed method with the RMS map also greatly outperforms the reference method, especially when using SD-sEMG signals. The results demonstrate the validity of the proposed method with the RMS map for user authentication.
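The RMS map named in the last abstract above can be sketched as follows. This is a generic illustration, assuming one RMS value per sEMG channel arranged on a rectangular electrode grid in row-major order; the function name `rms_map`, the grid layout, and the sample values are hypothetical and not taken from the paper.

```python
import math


def rms_map(signals, rows, cols):
    """Build a rows x cols image of per-channel root mean square values.

    signals -- list of per-channel sample lists, ordered row by row across
               the electrode grid (assumed layout)
    """
    if len(signals) != rows * cols:
        raise ValueError("number of channels must equal rows * cols")
    if any(len(channel) == 0 for channel in signals):
        raise ValueError("every channel needs at least one sample")
    # RMS of one channel: square root of the mean of the squared samples.
    rms = [math.sqrt(sum(s * s for s in ch) / len(ch)) for ch in signals]
    # Reshape the flat list of channel values into grid rows.
    return [rms[r * cols:(r + 1) * cols] for r in range(rows)]


if __name__ == "__main__":
    # Four hypothetical channels on a 2 x 2 grid, two samples each.
    window = [[3.0, -3.0], [4.0, 4.0], [0.0, 0.0], [1.0, -1.0]]
    print(rms_map(window, 2, 2))
```

Each cell summarizes the signal energy at one electrode over the analysis window, so the resulting grid can be passed to an image-based model like any other single-channel image.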
 An official website of the United States government