Title: Automatic Generation of Robot Actions for Collaborative Tasks from Speech
Robots have the potential to assist people in daily tasks, such as cooking a meal. Communicating with robots verbally and in an unstructured way is important, as spoken language is the main form of communication for humans. This paper proposes a novel framework that automatically generates robot actions from unstructured speech. The proposed framework was evaluated by collecting data from 15 participants preparing their meals while seated on a chair in a randomly disrupted environment. The system can identify and respond to a task sequence while the user is engaged in unrelated conversations, even if the user’s speech is unstructured and grammatically incorrect. The accuracy of the proposed system is 98.6%, which is a very promising finding.
Award ID(s):
2226165
NSF-PAR ID:
10433569
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
2023 9th International Conference on Automation, Robotics and Applications (ICARA)
Page Range / eLocation ID:
155 to 159
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Large-scale construction projects can benefit from having a team of heterogeneous building robots operating autonomously and cooperatively in unstructured environments. In this work, we propose a flexible system architecture, MARSala, that allows teams of distributed mobile robots to construct motion support structures in large and unstructured environments using purely local interactions. The paper primarily focuses on the deliberative layer of the architecture, which provides a means for formulating a construction project as a motion support structure construction problem. We implemented the architecture in simulation and demonstrated the benefits of such a formulation in two different construction scenarios set in large unstructured environments.
  2. When a mobile robot is deployed in a field environment, e.g., during a disaster response application, the capability of adapting its navigational behaviors to unstructured terrains is essential for effective and safe robot navigation. In this paper, we introduce a novel joint terrain representation and apprenticeship learning approach to implement robot adaptation to unstructured terrains. Different from conventional learning-based adaptation techniques, our approach provides a unified problem formulation that integrates representation and apprenticeship learning under a unified regularized optimization framework, instead of treating them as separate and independent procedures. Our approach also has the capability to automatically identify discriminative feature modalities, which can improve the robustness of robot adaptation. In addition, we implement a new optimization algorithm to solve the formulated problem, which provides a theoretical guarantee to converge to the global optimal solution. In the experiments, we extensively evaluate the proposed approach in real-world scenarios, in which a mobile robot navigates on familiar and unfamiliar unstructured terrains. Experimental results have shown that the proposed approach is able to transfer human expertise to robots with small errors, achieve superior performance compared with previous and baseline methods, and provide intuitive insights on the importance of terrain feature modalities. 
  3. Ground robots require the crucial capability of traversing unstructured and unprepared terrains and avoiding obstacles to complete tasks in real-world robotics applications such as disaster response. When a robot operates in off-road field environments such as forests, the robot’s actual behaviors often do not match its expected or planned behaviors, due to changes in the characteristics of terrains and the robot itself. Therefore, the capability of robot adaptation for consistent behavior generation is essential for maneuverability on unstructured off-road terrains. To address this challenge, we propose a novel method of self-reflective terrain-aware adaptation for ground robots to generate consistent controls to navigate over unstructured off-road terrains, which enables robots to more accurately execute the expected behaviors through robot self-reflection while adapting to varying unstructured terrains. To evaluate our method’s performance, we conduct extensive experiments using real ground robots with various functionality changes over diverse unstructured off-road terrains. The comprehensive experimental results have shown that our self-reflective terrain-aware adaptation method enables ground robots to generate consistent navigational behaviors and outperforms the compared previous and baseline techniques.
  4. Despite the inherent need for enhancing human-robot interaction (HRI) by non-visually communicating robotic movements and intentions, the application of sonification (the translation of data into audible information) within the field of robotics remains underexplored. This paper investigates the problem of designing sonification algorithms that translate the motion of teams of industrial mobile robots to non-speech sounds. Our proposed solution leverages the wave space sonification (WSS) framework and utilizes localized wave fields with specific orientations within the system configuration space. This WSS-based algorithm generates sounds from the motion data of mobile robots so that the resulting audio exhibits a chosen timbre when the robots pass near designated configurations or move along desired directions. To demonstrate its versatility, the WSS-based sonification algorithm is applied to a team of OMRON LD series autonomous mobile robots, sonifying their motion patterns with pure tonal sounds. 
  5. Speech activity detection (SAD) serves as a crucial front-end system to several downstream Speech and Language Technology (SLT) tasks such as speaker diarization, speaker identification, and speech recognition. Recent years have seen deep learning (DL)-based SAD systems designed to improve robustness against static background noise and interfering speakers. However, SAD performance can be severely limited for conversations recorded in naturalistic environments due to dynamic acoustic scenarios and previously unseen non-speech artifacts. In this letter, we propose an end-to-end deep learning framework designed to be robust to time-varying noise profiles observed in naturalistic audio. We develop a novel SAD solution for the UTDallas Fearless Steps Apollo corpus based on NASA’s Apollo missions. The proposed system leverages spectro-temporal correlations with a threshold optimization mechanism to adjust to acoustic variabilities across multiple channels and missions. This system is trained and evaluated on the Fearless Steps Challenge (FSC) corpus (a subset of the Apollo corpus). Experimental results indicate a high degree of adaptability to out-of-domain data, achieving a relative Detection Cost Function (DCF) performance improvement of over 50% compared to the previous FSC baselines and state-of-the-art (SOTA) SAD systems. The proposed model also outperforms the most recent DL-based SOTA systems from FSC Phase-4. Ablation analysis is conducted to confirm the efficacy of the proposed spectro-temporal features. 