Title: DRL-VO: Learning to Navigate Through Crowded Dynamic Scenes Using Velocity Obstacles
This article proposes a novel learning-based control policy with strong generalizability to new environments that enables a mobile robot to navigate autonomously through spaces filled with both static obstacles and dense crowds of pedestrians. The policy uses a unique combination of input data to generate the desired steering angle and forward velocity: a short history of lidar data, kinematic data about nearby pedestrians, and a subgoal point. The policy is trained in a reinforcement learning setting using a reward function that contains a novel term based on velocity obstacles to guide the robot to actively avoid pedestrians and move toward the goal. Through a series of 3-D simulated experiments with up to 55 pedestrians, this control policy is able to achieve a better balance between collision avoidance and speed (i.e., higher success rate and faster average speed) than state-of-the-art model-based and learning-based policies, and it also generalizes better to different crowd sizes and unseen environments. An extensive series of hardware experiments demonstrates the ability of this policy to work directly in different real-world environments with varying crowd sizes, with zero retraining. Furthermore, a series of simulated and hardware experiments shows that the control policy also works in highly constrained static environments on a different robot platform without any additional training. Lastly, several important lessons that can be applied to other robot learning systems are summarized.
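The velocity-obstacle reward term described above is the core addition of this work. The exact formulation is given in the paper; purely as an illustration of the underlying idea, a minimal sketch of a VO-style reward check (the function names, the simplified cone geometry, and the reward weights are assumptions made for this sketch, not the authors' implementation) could look like the following:

```python
import numpy as np

def in_velocity_obstacle(robot_vel, ped_pos, ped_vel, combined_radius):
    """Check whether the robot's commanded velocity lies inside the
    velocity-obstacle cone induced by one pedestrian.
    ped_pos is the pedestrian's position in the robot frame (illustrative geometry)."""
    rel_vel = np.asarray(robot_vel) - np.asarray(ped_vel)   # velocity relative to the pedestrian
    dist = np.linalg.norm(ped_pos)
    if dist <= combined_radius:                              # already overlapping
        return True
    # Half-angle of the collision cone subtended by the inflated pedestrian.
    half_angle = np.arcsin(combined_radius / dist)
    angle_to_ped = np.arctan2(ped_pos[1], ped_pos[0])
    rel_vel_angle = np.arctan2(rel_vel[1], rel_vel[0])
    ang_diff = np.abs((rel_vel_angle - angle_to_ped + np.pi) % (2 * np.pi) - np.pi)
    return ang_diff < half_angle and np.linalg.norm(rel_vel) > 0.0

def vo_reward(robot_vel, pedestrians, combined_radius=0.6, r_clear=0.3, r_blocked=-0.7):
    """Illustrative reward term: positive when the commanded velocity stays outside
    every pedestrian's velocity obstacle, negative otherwise (weights are assumptions)."""
    for ped_pos, ped_vel in pedestrians:
        if in_velocity_obstacle(robot_vel, np.asarray(ped_pos), ped_vel, combined_radius):
            return r_blocked
    return r_clear

# Example: one pedestrian 2 m ahead, walking toward the robot.
print(vo_reward(robot_vel=[0.8, 0.0], pedestrians=[([2.0, 0.0], [-0.5, 0.0])]))
```

In the paper this term is only one component of the full reward, alongside goal-progress and collision terms; the sketch shows just the geometric check that distinguishes velocities heading into a pedestrian's collision cone from those that steer clear.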
Award ID(s): 1830419
NSF-PAR ID: 10510588
Author(s) / Creator(s):
Publisher / Repository: IEEE
Date Published:
Journal Name: IEEE Transactions on Robotics
Volume: 39
Issue: 4
ISSN: 1552-3098
Page Range / eLocation ID: 2700–2719
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. This paper proposes a novel neural network-based control policy to enable a mobile robot to navigate safely through environments filled with both static obstacles, such as tables and chairs, and dense crowds of pedestrians. The network architecture uses early fusion to combine a short history of lidar data with kinematic data about nearby pedestrians. This kinematic data is key to enabling safe robot navigation in these uncontrolled, human-filled environments. The network is trained in a supervised setting, using expert demonstrations to learn safe navigation behaviors. A series of experiments in detailed simulated environments demonstrates the efficacy of this policy, which is able to achieve a higher success rate than either standard model-based planners or state-of-the-art neural network control policies that use only raw sensor data.
  2. In this paper, we present a decentralized control approach based on a Nonlinear Model Predictive Control (NMPC) method that employs barrier certificates for safe navigation of multiple nonholonomic wheeled mobile robots in unknown environments with static and/or dynamic obstacles. This method incorporates a Learned Barrier Function (LBF) into the NMPC design in order to guarantee safe robot navigation, i.e., prevent robot collisions with other robots and the obstacles. We refer to our proposed control approach as NMPC-LBF. Since each robot does not have a priori knowledge about the obstacles and other robots, we use a Deep Neural Network (DeepNN) running in real-time on each robot to learn the Barrier Function (BF) only from the robot's LiDAR and odometry measurements. The DeepNN is trained to learn the BF that separates safe and unsafe regions. We implemented our proposed method on simulated and actual Turtlebot3 Burger robot(s) in different scenarios. The implementation results show the effectiveness of the NMPC-LBF method at ensuring safe navigation of the robots. 
  3. Intelligent multi-purpose robotic assistants have the potential to assist nurses with a variety of non-critical tasks, such as object fetching, disinfecting areas, or supporting patient care. This paper focuses on enabling a multi-purpose robot to guide patients while walking. The proposed robotic framework aims to enable a robot to learn how to navigate a crowded hospital environment while maintaining contact with the patient. Two deep reinforcement learning models are developed: the first model considers only dynamic obstacles (e.g., humans), while the second considers both static and dynamic obstacles in the environment. The models output the robot’s velocity based on the following inputs: the patient’s gait velocity, computed using a leg-detection method, and spatial and temporal information about the environment, the humans in the scene, and the robot. The proposed models demonstrate promising results. Finally, the model that considers both static and dynamic obstacles is successfully deployed in the Gazebo simulation environment.
  4. Patterns of crowd behavior are believed to result from local interactions between pedestrians. Many studies have investigated the local rules of interaction, such as steering, avoiding, and alignment, but how pedestrians control their walking speed when following another remains unsettled. Most pedestrian models assume the physical speed and distance of others as input. The present study compares such “omniscient” models with “visual” models based on optical variables. We experimentally tested eight speed control models from the pedestrian- and car-following literature. Walking participants were asked to follow a leader (a moving pole) in a virtual environment, while the leader’s speed was perturbed during the trial. In Experiment 1, the leader’s initial distance was varied. Each model was fit to the data and compared. The results showed that visual models based on the optical expansion rate (θ̇) had the smallest root mean square error in speed across conditions, whereas other models exhibited increased error at longer distances. In Experiment 2, the leader’s size (pole diameter) was varied. A model based on the relative rate of expansion (θ̇/θ) performed better than the expansion-rate model (θ̇), because it is less sensitive to leader size. Together, the results imply that pedestrians directly control their walking speed in one-dimensional following using the relative rate of expansion, rather than the distal speed and distance of the leader. (An illustrative sketch of such an expansion-based speed controller appears after this list.)
  5. Pedestrian regulation can prevent crowd accidents and improve crowd safety in densely populated areas. Recent studies use mobile robots to regulate pedestrian flows for desired collective motion through the effect of passive human-robot interaction (HRI). This paper formulates a robot motion planning problem for the optimization of two merging pedestrian flows moving through a bottleneck exit. To address the challenge of feature representation of complex human motion dynamics under the effect of HRI, we propose using a deep neural network to model the mapping from the image input of pedestrian environments to the output of robot motion decisions. The robot motion planner is trained end-to-end using a deep reinforcement learning algorithm, which avoids hand-crafted feature detection and extraction, thus improving the learning capability for complex dynamic problems. Our proposed approach is validated in simulated experiments, and its performance is evaluated. The results demonstrate that the robot is able to find optimal motion decisions that maximize the pedestrian outflow in different flow conditions, and the pedestrian-accumulated outflow increases significantly compared to cases without robot regulation and with random robot motion. 
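The fourth result above argues that followers regulate walking speed from the relative rate of optical expansion θ̇/θ rather than from the leader's distal speed and distance. As a rough illustration of that idea only (the gain, the discrete-time form, and the scenario parameters below are assumptions, not the fitted model from the study), such a controller can be simulated in a few lines:

```python
import math

def angular_size(width, distance):
    """Optical angle theta subtended by a leader of the given physical width."""
    return 2.0 * math.atan2(width / 2.0, distance)

def follower_acceleration(theta, theta_prev, dt, gain=4.0):
    """Speed adjustment driven by the relative expansion rate theta_dot / theta:
    decelerate when the leader looms (image expands), speed up when it recedes.
    The proportional gain is an illustrative assumption."""
    theta_dot = (theta - theta_prev) / dt
    return -gain * (theta_dot / theta)

# Toy scenario: a follower 4 m behind a 0.3 m-wide leader whose speed is perturbed.
dt, width = 0.1, 0.3
distance, speed, leader_speed = 4.0, 1.2, 1.2
theta_prev = angular_size(width, distance)
for step in range(60):
    if step == 10:
        leader_speed = 0.8                     # the leader suddenly slows down
    distance += (leader_speed - speed) * dt    # headway shrinks while the follower is faster
    theta = angular_size(width, distance)
    speed += follower_acceleration(theta, theta_prev, dt) * dt
    theta_prev = theta
print(f"final speed {speed:.2f} m/s, headway {distance:.2f} m")
```

Because θ̇/θ ≈ −ḋ/d for small angles, the commanded change in speed is roughly proportional to the speed difference divided by headway and does not depend on the leader's physical size, consistent with the finding that the relative-rate model is insensitive to leader size.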