skip to main content


Title: Design of a Real-Time Human-Robot Collaboration System Using Dynamic Gestures
Abstract

With the development of industrial automation and artificial intelligence, robotic systems are developing into an essential part of factory production, and the human-robot collaboration (HRC) becomes a new trend in the industrial field. In our previous work, ten dynamic gestures have been designed for communication between a human worker and a robot in manufacturing scenarios, and a dynamic gesture recognition model based on Convolutional Neural Networks (CNN) has been developed. Based on the model, this study aims to design and develop a new real-time HRC system based on multi-threading method and the CNN. This system enables the real-time interaction between a human worker and a robotic arm based on dynamic gestures. Firstly, a multi-threading architecture is constructed for high-speed operation and fast response while schedule more than one task at the same time. Next, A real-time dynamic gesture recognition algorithm is developed, where a human worker’s behavior and motion are continuously monitored and captured, and motion history images (MHIs) are generated in real-time. The generation of the MHIs and their identification using the classification model are synchronously accomplished. If a designated dynamic gesture is detected, it is immediately transmitted to the robotic arm to conduct a real-time response. A Graphic User Interface (GUI) for the integration of the proposed HRC system is developed for the visualization of the real-time motion history and classification results of the gesture identification. A series of actual collaboration experiments are carried out between a human worker and a six-degree-of-freedom (6 DOF) Comau industrial robot, and the experimental results show the feasibility and robustness of the proposed system.

 
more » « less
Award ID(s):
1646162
NSF-PAR ID:
10212192
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
Proceedings of the ASME 2020 International Mechanical Engineering Congress and Exposition (IMECE 2020)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract As artificial intelligence and industrial automation are developing, human–robot collaboration (HRC) with advanced interaction capabilities has become an increasingly significant area of research. In this paper, we design and develop a real-time, multi-model HRC system using speech and gestures. A set of 16 dynamic gestures is designed for communication from a human to an industrial robot. A data set of dynamic gestures is designed and constructed, and it will be shared with the community. A convolutional neural network is developed to recognize the dynamic gestures in real time using the motion history image and deep learning methods. An improved open-source speech recognizer is used for real-time speech recognition of the human worker. An integration strategy is proposed to integrate the gesture and speech recognition results, and a software interface is designed for system visualization. A multi-threading architecture is constructed for simultaneously operating multiple tasks, including gesture and speech data collection and recognition, data integration, robot control, and software interface operation. The various methods and algorithms are integrated to develop the HRC system, with a platform constructed to demonstrate the system performance. The experimental results validate the feasibility and effectiveness of the proposed algorithms and the HRC system. 
    more » « less
  2. Abstract

    Human-robot collaboration (HRC) is a challenging task in modern industry and gesture communication in HRC has attracted much interest. This paper proposes and demonstrates a dynamic gesture recognition system based on Motion History Image (MHI) and Convolutional Neural Networks (CNN). Firstly, ten dynamic gestures are designed for a human worker to communicate with an industrial robot. Secondly, the MHI method is adopted to extract the gesture features from video clips and generate static images of dynamic gestures as inputs to CNN. Finally, a CNN model is constructed for gesture recognition. The experimental results show very promising classification accuracy using this method.

     
    more » « less
  3. Hideki Aoyama ; Keiich Shirase (Ed.)
    An integral part of information-centric smart manufacturing is the adaptation of industrial robots to complement human workers in a collaborative manner. While advancement in sensing has enabled real-time monitoring of workspace, understanding the semantic information in the workspace, such as parts and tools, remains a challenge for seamless robot integration. The resulting lack of adaptivity to perform in a dynamic workspace have limited robots to tasks with pre-defined actions. In this paper, a machine learning-based robotic object detection and grasping method is developed to improve the adaptivity of robots. Specifically, object detection based on the concept of single-shot detection (SSD) and convolutional neural network (CNN) is investigated to recognize and localize objects in the workspace. Subsequently, the extracted information from object detection, such as the type, position, and orientation of the object, is fed into a multi-layer perceptron (MLP) to generate the desired joint angles of robotic arm for proper object grasping and handover to the human worker. Network training is guided by forward kinematics of the robotic arm in a self-supervised manner to mitigate issues such as singularity in computation. The effectiveness of the developed method is validated on an eDo robotic arm in a human-robot collaborative assembly case study. 
    more » « less
  4. Human-Robot Collaboration (HRC), which envisions a workspace in which human and robot can dynamically collaborate, has been identified as a key element in smart manufacturing. Human action recognition plays a key role in the realization of HRC as it helps identify current human action and provides the basis for future action prediction and robot planning. Despite recent development of Deep Learning (DL) that has demonstrated great potential in advancing human action recognition, one of the key issues remains as how to effectively leverage the temporal information of human motion to improve the performance of action recognition. Furthermore, large volume of training data is often difficult to obtain due to manufacturing constraints, which poses challenge for the optimization of DL models. This paper presents an integrated method based on optical flow and convolutional neural network (CNN)-based transfer learning to tackle these two issues. First, optical flow images, which encode the temporal information of human motion, are extracted and serve as the input to a two-stream CNN structure for simultaneous parsing of spatial-temporal information of human motion. Then, transfer learning is investigated to transfer the feature extraction capability of a pretrained CNN to manufacturing scenarios. Evaluation using engine block assembly confirmed the effectiveness of the developed method. 
    more » « less
  5. Human-Robot Collaboration (HRC), which enables a workspace where human and robot can dynamically and safely collaborate for improved operational efficiency, has been identified as a key element in smart manu­facturing. Human action recognition plays a key role in the realization ofHRC, as it helps identify current human action and provides the basis for future action prediction and robot planning. While Deep Learning (DL) has demonstrated great potential in advancing human action recognition, effectively leveraging the temporal in­formation of human motions to improve the accuracy and robustness of action recognition has remained as a challenge. Furthermore, it is often difficult to obtain a large volume of data for DL network training and opti­mization, due to operational constraints in a realistic manufacturing setting. This paper presents an integrated method to address these two challenges, based on the optical flow and convolutional neural network (CNN)­based transfer learning. Specifically, optical flow images, which encode the temporal information of human motion, are extracted and serve as the input to a two-stream CNN structure for simultaneous parsing of spatial­-temporal information of human motion. Subsequently, transfer learning is investigated to transfer the feature extraction capability of a pre-trained CNN to manufacturing scenarios. Evaluation using engine block assembly confirmed the effectiveness of the developed method. 
    more » « less