Title: Talk the talk and walk the walk: Dialogue-driven navigation in unknown indoor environments
Prior work in natural-language-driven navigation demonstrates success in systems deployed in synthetic environments or applied to large datasets, both real and synthetic. However, there is an absence of such frameworks being deployed and rigorously tested in real environments, unknown a priori. In this paper, we present a novel framework that uses spoken dialogue with a real person to interpret a set of navigational instructions into a plan and subsequently execute that plan in a novel, unknown, indoor environment. This framework is implemented on a real robot and its performance is evaluated in 39 trials across 3 novel test-building environments. We also demonstrate that our approach outperforms three prior vision-and-language navigation methods in this same environment.
Award ID(s):
1734938 1522954
PAR ID:
10284074
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
2021 IEEE/RSJ International Conference on Intelligent Robots and Systems
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Contemporary approaches to perception, planning, estimation, and control have allowed robots to operate robustly as our remote surrogates in uncertain, unstructured environments. This progress now creates an opportunity for robots to operate not only in isolation, but also with and alongside humans in our complex environments. Realizing this opportunity requires an efficient and flexible medium through which humans can communicate with collaborative robots. Natural language provides one such medium, and through significant progress in statistical methods for natural-language understanding, robots are now able to interpret a diverse array of free-form navigation, manipulation, and mobile-manipulation commands. However, most contemporary approaches require a detailed, prior spatial-semantic map of the robot’s environment that models the space of possible referents of an utterance. Consequently, these methods fail when robots are deployed in new, previously unknown, or partially-observed environments, particularly when mental models of the environment differ between the human operator and the robot. This paper provides a comprehensive description of a novel learning framework that allows field and service robots to interpret and correctly execute natural-language instructions in a priori unknown, unstructured environments. Integral to our approach is its use of language as a “sensor”—inferring spatial, topological, and semantic information implicit in natural-language utterances and then exploiting this information to learn a distribution over a latent environment model. We incorporate this distribution in a probabilistic, language grounding model and infer a distribution over a symbolic representation of the robot’s action space, consistent with the utterance. We use imitation learning to identify a belief-space policy that reasons over the environment and behavior distributions. 
We evaluate our framework through a variety of different navigation and mobile-manipulation experiments involving an unmanned ground vehicle, a robotic wheelchair, and a mobile manipulator, demonstrating that the algorithm can follow natural-language instructions without prior knowledge of the environment. 
  2. This paper presents a novel framework for memory-based navigation for terrestrial robots, utilizing a customized multimodal large language model (MLLM) to interpret visual inputs and generate navigation commands. The system employs a Unitree GO1 robot equipped with a camera to capture environmental images, which are processed by the customized MLLM for navigation. By leveraging a memory-based approach, the robot efficiently reuses previously traversed paths, reducing the need for re-exploration and enhancing navigation efficiency. The hybrid controller in this work features a deliberation unit and a reactive controller for high-level commands and robot alignment. Experimental validation in a hallway-like environment demonstrates that memory-driven navigation improves path retracing and overall performance. 
  3. Mobile robot navigation is a critical aspect of robotics, with applications spanning from service robots to industrial automation. However, navigating in complex and dynamic environments poses many challenges, such as avoiding obstacles, making decisions in real-time, and adapting to new situations. Reinforcement Learning (RL) has emerged as a promising approach to enable robots to learn navigation policies from their interactions with the environment. However, application of RL methods to real-world tasks such as mobile robot navigation, and evaluating their performance under various training–testing settings has not been sufficiently researched. In this paper, we have designed an evaluation framework that investigates the RL algorithm’s generalization capability in regard to unseen scenarios in terms of learning convergence and success rates by transferring learned policies in simulation to physical environments. To achieve this, we designed a simulated environment in Gazebo for training the robot over a high number of episodes. The training environment closely mimics the typical indoor scenarios that a mobile robot can encounter, replicating real-world challenges. For evaluation, we designed physical environments with and without unforeseen indoor scenarios. This evaluation framework outputs statistical metrics, which we then use to conduct an extensive study on a deep RL method, namely the proximal policy optimization (PPO). The results provide valuable insights into the strengths and limitations of the method for mobile robot navigation. Our experiments demonstrate that the trained model from simulations can be deployed to the previously unseen physical world with a success rate of over 88%. The insights gained from our study can assist practitioners and researchers in selecting suitable RL approaches and training–testing settings for their specific robotic navigation tasks. 
  4. Abstract: Navigation, the ability to relocate from one place to another, is a critical skill for any individual or group. Navigating safely across unknown environments is a critical factor in determining the success of a mission. While there is an existing body of applications in the field of land navigation, they primarily rely on GPS-enabled technologies. Moreover, there is limited research on Augmented Reality (AR) as a tool for navigation in unknown environments. This research proposes to develop an AR system to provide 3-dimensional (3D) navigational insights in unfamiliar environments. This can be accomplished by generating 3D terrestrial maps leveraging Synthetic Aperture Radar (SAR) data, Google Earth imagery and sparse knowledge of GPS coordinates of the region. Furthermore, the 3D terrestrial images are converted to navigational meshes to make it feasible for path-finding algorithms to work. The proposed method can be used to create an iteratively refined 3D landscape knowledge-database that can assist personnel in navigating novel environments or assist in mission planning for other operations. It can also be used to help plan/access the best strategic vantage points in the landscape.
     Keywords: navigation, three-dimensional, image processing, mesh, augmented reality, mixed reality, SAR, GPS
  5. From navigation in unfamiliar environments to career planning, people typically first sample information before committing to a plan. However, most studies find that people adopt myopic strategies when sampling information. Here we challenge those findings by investigating whether contingency planning is a driver of information sampling. To this aim, we developed a novel navigation task that is a shortest path finding problem under uncertainty of bridge closures. Participants (n = 109) were allowed to sample information on bridge statuses prior to committing to a path. We developed a computational model in which the agent samples information based on the cost of switching to a contingency plan. We find that this model fits human behavior well and is qualitatively similar to the approximated optimal solution. Together, this suggests that humans use contingency planning as a driver of information sampling.