A framework for autonomous waypoint planning, trajectory generation through waypoints, and trajectory tracking for multi-rotor unmanned aerial vehicles (UAVs) is proposed in this work. Safe and effective operations of these UAVs is a problem that demands obstacle avoidance strategies and advanced trajectory planning and control schemes for stability and energy efficiency. To address this problem, a two-level optimization strategy is used for trajectory generation, then the trajectory is tracked in a stable manner. The framework given here consists of the following components: (a) a deep reinforcement learning (DRL)-based algorithm for optimal waypoint planning while minimizing control energy and avoiding obstacles in a given environment; (b) an optimal, smooth trajectory generation algorithm through waypoints, that minimizes a combinaton of velocity, acceleration, jerk and snap; and (c) a stable tracking control law that determines a control thrust force for an UAV to track the generated trajectory.
more »
« less
Autonomous UAV with Learned Trajectory Generation and Control
Unmanned aerial vehicle (UAV) technology is a rapidly growing field with tremendous opportunities for research and applications. To achieve true autonomy for UAVs in the absence of remote control, external navigation aids like global navigation satellite systems and radar systems, a minimum energy trajectory planning that considers obstacle avoidance and stability control will be the key. Although this can be formulated as a constrained optimization problem, due to the complicated non-linear relationships between UAV trajectory and thrust control, it is almost impossible to be solved analytically. While deep reinforcement learning is known for its ability to provide model free optimization for complex system through learning, its state space, actions and reward functions must be designed carefully. This paper presents our vision of different layers of autonomy in a UAV system, and our effort in generating and tracking the trajectory both using deep reinforcement learning (DRL). The experimental results show that compared to conventional approaches, the learned trajectory will need 20% less control thrust and 18% less time to reach the target. Furthermore, using the control policy learning by DRL, the UAV will achieve 58.14% less position error and 21.77% less system power.
more »
« less
- Award ID(s):
- 1739748
- PAR ID:
- 10195593
- Date Published:
- Journal Name:
- IEEE Workshop on Signal Processing Systems (SIPS)
- Page Range / eLocation ID:
- 115 to 120
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Continuous trajectory control of fixed-wing unmanned aerial vehicles (UAVs) is complicated when considering hidden dynamics. Due to UAV multi degrees of freedom, tracking methodologies based on conventional control theory, such as Proportional-Integral-Derivative (PID) has limitations in response time and adjustment robustness, while a model based approach that calculates the force and torques based on UAV’s current status is complicated and rigid.We present an actor-critic reinforcement learning framework that controls UAV trajectory through a set of desired waypoints. A deep neural network is constructed to learn the optimal tracking policy and reinforcement learning is developed to optimize the resulting tracking scheme. The experimental results show that our proposed approach can achieve 58.14% less position error, 21.77% less system power consumption and 9:23% faster attainment than the baseline. The actor network consists of only linear operations, hence Field Programmable Gate Arrays (FPGA) based hardware acceleration can easily be designed for energy efficient real-time control.more » « less
-
null (Ed.)While Deep Reinforcement Learning has emerged as a de facto approach to many complex experience-driven networking problems, it remains challenging to deploy DRL into real systems. Due to the random exploration or half-trained deep neural networks during the online training process, the DRL agent may make unexpected decisions, which may lead to system performance degradation or even system crash. In this paper, we propose PnP-DRL, an offline-trained, plug and play DRL solution, to leverage the batch reinforcement learning approach to learn the best control policy from pre-collected transition samples without interacting with the system. After being trained without interaction with systems, our Plug and Play DRL agent will start working seamlessly, without additional exploration or possible disruption of the running systems. We implement and evaluate our PnP-DRL solution on a prevalent experience-driven networking problem, Dynamic Adaptive Streaming over HTTP (DASH). Extensive experimental results manifest that 1) The existing batch reinforcement learning method has its limits; 2) Our approach PnP-DRL significantly outperforms classical adaptive bitrate algorithms in average user Quality of Experience (QoE); 3) PnP-DRL, unlike the state-of-the-art online DRL methods, can be off and running without learning gaps, while achieving comparable performances.more » « less
-
Deep reinforcement learning (DRL) has demonstrated impressive success in solving complex control tasks by synthesizing control policies from data. However, the safety and stability of applying DRL to safety-critical systems remain a primary concern and challenging problem. To address the problem, we propose the Phy-DRL: a novel physics-model regulated deep reinforcement learning framework. The Phy-DRL is novel in two architectural designs: a physics-model-regulated reward and residual control, which integrates physics-model-based control and data-driven control. The concurrent designs enable the Phy-DRL to mathematically provable safety and stability guarantees. Finally, the effectiveness of the Phy-DRL is validated by an inverted pendulum system. Additionally, the experimental results demonstrate that the Phy-DRL features remarkably accelerated training and enlarged reward.more » « less
-
Bogomolov, S.; Parker, D. (Ed.)Safety is a critical concern for the next generation of autonomy that is likely to rely heavily on deep neural networks for perception and control. This paper proposes a method to repair unsafe ReLU DNNs in safety-critical systems using reachability analysis. Our repair method uses reachability analysis to calculate the unsafe reachable domain of a DNN, and then uses a novel loss function to construct its distance to the safe domain during the retraining process. Since subtle changes of the DNN parameters can cause unexpected performance degradation, we also present a minimal repair approach where the DNN deviation is minimized. Furthermore, we explore applications of our method to repair DNN agents in deep reinforcement learning (DRL) with seamless integration with learning algorithms. Our method is evaluated on the ACAS Xu benchmark and a rocket lander system against the state-of-the-art method ART. Experimental results show that our repair approach can generate provably safe DNNs on multiple safety specifications with negligible performance degradation, even in the absence of training data (Code is available online at https://github.com/Shaddadi/veritex.git).more » « less
An official website of the United States government

