This paper proposes a novel neural network-based control policy to enable a mobile robot to navigate safely through environments filled with both static obstacles, such as tables and chairs, and dense crowds of pedestrians. The network architecture uses early fusion to combine a short history of lidar data with kinematic data about nearby pedestrians. This kinematic data is key to enabling safe robot navigation in these uncontrolled, human-filled environments. The network is trained in a supervised setting, using expert demonstrations to learn safe navigation behaviors. A series of experiments in detailed simulated environments demonstrates the efficacy of this policy, which achieves a higher success rate than either standard model-based planners or state-of-the-art neural network control policies that use only raw sensor data.
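As a rough illustration of the early-fusion idea described above, the following PyTorch sketch stacks a short lidar history and a pedestrian kinematic map as input channels of a single shared convolutional trunk that outputs steering angle and forward velocity. The layer sizes and input shapes are hypothetical, not the paper's actual architecture.

```python
# Minimal early-fusion sketch (hypothetical shapes, not the paper's network):
# lidar history and pedestrian kinematics are fused at the input layer.
import torch
import torch.nn as nn

class EarlyFusionPolicy(nn.Module):
    def __init__(self, lidar_frames=10, ped_channels=2, n_actions=2):
        super().__init__()
        in_ch = lidar_frames + ped_channels  # fuse at the first layer
        self.trunk = nn.Sequential(
            nn.Conv2d(in_ch, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2), nn.ReLU(),
            nn.Flatten(),
        )
        self.head = nn.LazyLinear(n_actions)  # steering angle, forward velocity

    def forward(self, lidar_hist, ped_map):
        # lidar_hist: (B, lidar_frames, H, W); ped_map: (B, ped_channels, H, W)
        x = torch.cat([lidar_hist, ped_map], dim=1)
        return self.head(self.trunk(x))

policy = EarlyFusionPolicy()
out = policy(torch.zeros(1, 10, 80, 80), torch.zeros(1, 2, 80, 80))
```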
DRL-VO: Learning to Navigate Through Crowded Dynamic Scenes Using Velocity Obstacles
This article proposes a novel learning-based control policy, with strong generalizability to new environments, that enables a mobile robot to navigate autonomously through spaces filled with both static obstacles and dense crowds of pedestrians. The policy uses a unique combination of input data to generate the desired steering angle and forward velocity: a short history of lidar data, kinematic data about nearby pedestrians, and a subgoal point. The policy is trained in a reinforcement learning setting using a reward function that contains a novel term based on velocity obstacles to guide the robot to actively avoid pedestrians and move toward the goal. Through a series of 3-D simulated experiments with up to 55 pedestrians, this control policy achieves a better balance between collision avoidance and speed (i.e., a higher success rate and a faster average speed) than state-of-the-art model-based and learning-based policies, and it also generalizes better to different crowd sizes and unseen environments. An extensive series of hardware experiments demonstrates the ability of this policy to work directly in different real-world environments with different crowd sizes and zero retraining. Furthermore, a series of simulated and hardware experiments shows that the control policy also works in highly constrained static environments on a different robot platform without any additional training. Lastly, several important lessons that can be applied to other robot learning systems are summarized.
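To make the velocity-obstacle reward term concrete, here is a minimal NumPy sketch of one way such a term could be written. The collision-cone test is the standard linear velocity-obstacle condition; the weighting and exact functional form are hypothetical, not the paper's.

```python
# Sketch of a velocity-obstacle reward term (hypothetical coefficients):
# penalize robot velocities inside the collision cone of a nearby pedestrian,
# reward those that steer outside it.
import numpy as np

def vo_reward(p_rob, v_rob, p_ped, v_ped, radius=0.6, w=1.0):
    rel_p = p_ped - p_rob                  # relative position, robot -> pedestrian
    rel_v = v_rob - v_ped                  # robot velocity relative to pedestrian
    dist = np.linalg.norm(rel_p)
    if dist <= radius:                     # already within the combined radius
        return -w
    half_angle = np.arcsin(radius / dist)  # half-angle of the collision cone
    cos_a = rel_p @ rel_v / (dist * np.linalg.norm(rel_v) + 1e-9)
    angle = np.arccos(np.clip(cos_a, -1.0, 1.0))
    # inside the cone means the relative velocity leads to collision
    return -w if angle < half_angle else w * (angle - half_angle)
```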
- Award ID(s):
- 1830419
- PAR ID:
- 10510588
- Publisher / Repository:
- IEEE
- Date Published:
- Journal Name:
- IEEE Transactions on Robotics
- Volume:
- 39
- Issue:
- 4
- ISSN:
- 1552-3098
- Page Range / eLocation ID:
- 2700 to 2719
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
In this paper, we present a decentralized control approach based on a Nonlinear Model Predictive Control (NMPC) method that employs barrier certificates for safe navigation of multiple nonholonomic wheeled mobile robots in unknown environments with static and/or dynamic obstacles. This method incorporates a Learned Barrier Function (LBF) into the NMPC design in order to guarantee safe robot navigation, i.e., to prevent robot collisions with other robots and the obstacles. We refer to our proposed control approach as NMPC-LBF. Since each robot does not have a priori knowledge about the obstacles and other robots, we use a Deep Neural Network (DeepNN) running in real time on each robot to learn the Barrier Function (BF) only from the robot's LiDAR and odometry measurements. The DeepNN is trained to learn the BF that separates safe and unsafe regions. We implemented our proposed method on simulated and actual Turtlebot3 Burger robot(s) in different scenarios. The implementation results show the effectiveness of the NMPC-LBF method in ensuring safe navigation of the robots.
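A minimal PyTorch sketch of the learned-barrier idea, with a hypothetical architecture: an MLP maps LiDAR and odometry features to a scalar barrier value h(x), and the NMPC would enforce a decrease condition along its predicted trajectory. The discrete-time condition shown is a common choice in the barrier-function literature, not necessarily the paper's exact formulation.

```python
# Hypothetical barrier network: h(x) > 0 in safe states, h(x) < 0 in unsafe ones.
import torch
import torch.nn as nn

class BarrierNet(nn.Module):
    def __init__(self, n_lidar=360, n_odom=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_lidar + n_odom, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, 1),               # scalar barrier value h(x)
        )

    def forward(self, lidar, odom):
        return self.net(torch.cat([lidar, odom], dim=-1))

def cbf_satisfied(h_next, h_curr, gamma=0.2):
    # discrete-time barrier condition an NMPC could impose as a constraint
    return h_next >= (1.0 - gamma) * h_curr
```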
-
Patterns of crowd behavior are believed to result from local interactions between pedestrians. Many studies have investigated the local rules of interaction, such as steering, avoiding, and alignment, but how pedestrians control their walking speed when following another remains unsettled. Most pedestrian models take the physical speed and distance of others as input. The present study compares such "omniscient" models with "visual" models based on optical variables. We experimentally tested eight speed control models from the pedestrian- and car-following literature. Walking participants were asked to follow a leader (a moving pole) in a virtual environment, while the leader's speed was perturbed during the trial. In Experiment 1, the leader's initial distance was varied. Each model was fit to the data and compared. The results showed that visual models based on optical expansion (θ̇) had the smallest root mean square error in speed across conditions, whereas other models exhibited increased error at longer distances. In Experiment 2, the leader's size (pole diameter) was varied. A model based on the relative rate of expansion (θ̇/θ) performed better than the expansion-rate model (θ̇), because it is less sensitive to leader size. Together, the results imply that pedestrians directly control their walking speed in one-dimensional following using the relative rate of expansion, rather than the distal speed and distance of the leader.
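A minimal NumPy sketch contrasting the two visual control laws named above: acceleration driven by the expansion rate θ̇ versus the relative expansion rate θ̇/θ. The geometry (a leader of known width subtending an optical angle) follows the description in the abstract; the gain b is hypothetical.

```python
# Visual speed control from optical expansion (hypothetical gain b).
import numpy as np

def visual_angle(width, dist):
    # optical angle subtended by a leader of physical width at distance dist
    return 2.0 * np.arctan(width / (2.0 * dist))

def follow_accel(width, dist, closing_speed, b=2.0, relative=True):
    theta = visual_angle(width, dist)
    # d(theta)/dt via the chain rule: the angle expands as the gap closes
    theta_dot = (width * closing_speed) / (dist**2 + (width / 2.0)**2)
    drive = theta_dot / theta if relative else theta_dot
    return -b * drive   # decelerate when the leader looms, speed up when it shrinks
```

Because θ̇/θ normalizes by the current angle, the relative-rate law gives the same response for a wide, far leader and a narrow, near one, which is why it is less sensitive to leader size.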
-
Intelligent multi-purpose robotic assistants have the potential to assist nurses with a variety of non-critical tasks, such as fetching objects, disinfecting areas, or supporting patient care. This paper focuses on enabling a multi-purpose robot to guide patients while walking. The proposed robotic framework aims to enable a robot to learn how to navigate a crowded hospital environment while maintaining contact with the patient. Two deep reinforcement learning models are developed; the first model considers only dynamic obstacles (e.g., humans), while the second model considers both static and dynamic obstacles in the environment. The models output the robot's velocity based on the following inputs: the patient's gait velocity (computed using a leg-detection method) and spatial and temporal information about the environment, the humans in the scene, and the robot. The proposed models demonstrate promising results. Finally, the model that considers both static and dynamic obstacles is successfully deployed in the Gazebo simulation environment.
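One concrete constraint of patient guidance is that the commanded speed must respect the patient's measured gait velocity. A minimal sketch of how such a cap could be applied to a learned policy's output; the function and margin are hypothetical, not from the paper.

```python
# Hypothetical speed cap: keep the guiding robot from outpacing the patient.
def guide_velocity(policy_cmd, gait_speed, margin=0.1):
    # policy_cmd: (linear v, angular w) from the learned navigation model
    # gait_speed: patient walking speed from the leg-detection module (m/s)
    v, w = policy_cmd
    v = max(0.0, min(v, gait_speed + margin))
    return v, w
```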
-
Multi-robot cooperative control has been extensively studied using model-based distributed control methods. However, such control methods rely on sensing and perception modules in a sequential pipeline design, and the separation of perception and control may cause processing latencies and compounding errors that degrade control performance. End-to-end learning overcomes this limitation by learning directly from onboard sensing data, with control commands output to the robots. Challenges exist in end-to-end learning for multi-robot cooperative control, however, and previous results are not scalable. In this article, we propose a novel decentralized cooperative control method for multi-robot formations using deep neural networks, in which inter-robot communication is modeled by a graph neural network (GNN). Our method takes LiDAR sensor data as input, and the control policy is learned from demonstrations provided by an expert controller for decentralized formation control. Although it is trained with a fixed number of robots, the learned control policy is scalable. Evaluation in a robot simulator demonstrates the triangular formation behavior of multi-robot teams of different sizes under the learned control policy.
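The scalability claim rests on the GNN's neighbor aggregation: the same learned weights apply regardless of team size. A minimal NumPy sketch of one such layer follows, with hypothetical dimensions and random weights standing in for the paper's trained network.

```python
# One GNN layer for decentralized control: each robot mixes its own features
# with the mean of its communication neighbors' features.
import numpy as np

def gnn_layer(X, A, W_self, W_nbr):
    # X: (n_robots, d) per-robot features, e.g. encoded LiDAR
    # A: (n_robots, n_robots) 0/1 adjacency of the communication graph
    deg = np.maximum(A.sum(axis=1, keepdims=True), 1.0)
    msg = (A @ X) / deg                     # mean over each robot's neighbors
    return np.tanh(X @ W_self + msg @ W_nbr)

rng = np.random.default_rng(0)
n, d = 4, 8                                 # 4 robots, 8-dim features
X = rng.normal(size=(n, d))
A = np.array([[0,1,1,0],[1,0,1,0],[1,1,0,1],[0,0,1,0]], float)
H = gnn_layer(X, A, rng.normal(size=(d, d)), rng.normal(size=(d, d)))
```

Because the weights are shared across robots and the aggregation is a mean, the same layer runs unchanged on a team of 4 or 40 robots.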