Quadratic performance indices associated with linear plants offer simplicity and lead to linear feedback control laws, but they may not adequately capture the complexity and flexibility required for many practical control problems. One notable example is improving, possibly with nonlinear laws, the trade-off between rise time and overshoot commonly observed in classical regulator problems with linear feedback control laws. To address these issues, non-quadratic terms can be introduced into the performance index, resulting in nonlinear control laws. In this study, we tackle optimal control problems with non-quadratic performance indices using the closed-loop neighboring extremal optimal control (NEOC) approach and a homotopy method. Building on the Linear Quadratic Regulator (LQR) framework, we introduce a parameter that scales the non-quadratic terms in the cost function and is continuously adjusted from 0 to 1, and we propose an iterative algorithm based on a closed-loop NEOC framework to handle each gradual adjustment. Additionally, we discuss and analyze the classical work of Bass and Webber, whose approach includes additional non-quadratic terms in the performance index so that the resulting Hamilton-Jacobi equation becomes analytically solvable. Numerical examples support our findings.
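The continuation structure described above can be sketched in a toy form: deform the cost gradually and re-solve at each step, warm-starting from the previous solution. In this minimal sketch the per-step solve is a scalar discrete-time Riccati fixed-point iteration standing in for the closed-loop NEOC update; the weight `q1` stands in for the non-quadratic terms, and all numbers are illustrative, not taken from the paper.

```python
# Toy continuation loop: sweep epsilon from 0 to 1, re-solving the
# (here purely quadratic) problem at each step with a warm start.
# In the paper's setting each step would instead be a closed-loop NEOC
# update handling the genuinely non-quadratic terms.

def riccati_fixed_point(a, b, q, r, p0, tol=1e-12, max_iter=10_000):
    """Solve the scalar discrete-time Riccati equation
    p = q + a^2 * r * p / (r + b^2 * p) by fixed-point iteration."""
    p = p0
    for _ in range(max_iter):
        p_next = q + a * a * r * p / (r + b * b * p)
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    return p

a, b, r = 1.1, 1.0, 1.0       # unstable scalar plant x' = a*x + b*u
q0, q1 = 1.0, 4.0             # q1 stands in for the extra cost terms
p = riccati_fixed_point(a, b, q0, r, p0=q0)        # epsilon = 0: plain LQR
for eps in [0.1 * k for k in range(1, 11)]:        # sweep epsilon 0 -> 1
    p = riccati_fixed_point(a, b, q0 + eps * q1, r, p0=p)  # warm start
    k_gain = a * b * p / (r + b * b * p)           # feedback u = -k_gain * x
    print(f"eps={eps:.1f}  p={p:.4f}  gain={k_gain:.4f}")
```

Warm-starting each solve from the previous epsilon is what makes the sweep cheap: each deformation of the cost is small, so the previous solution is already near the new fixed point.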
-
We introduce environment predictive coding, a self-supervised approach to learn environment-level representations for embodied agents. In contrast to prior work on self-supervised learning for images, we aim to jointly encode a series of images gathered by an agent as it moves about in 3D environments. We learn these representations via a zone prediction task, where we intelligently mask out portions of an agent's trajectory and predict them from the unmasked portions, conditioned on the agent's camera poses. By learning such representations on a collection of videos, we demonstrate successful transfer to multiple downstream navigation-oriented tasks. Our experiments on the photorealistic 3D environments of Gibson and Matterport3D show that our method outperforms the state-of-the-art on challenging tasks with only a limited budget of experience.
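The zone prediction task above can be illustrated by how a single training example might be assembled: mask a contiguous zone of the trajectory, keep the camera poses as conditioning, and ask the model to predict the masked features. The shapes and helper below are illustrative stand-ins, not the paper's actual encoder dimensions or masking policy.

```python
# Hedged sketch: build one zone-prediction training example by masking
# a contiguous span of per-step features while retaining all poses.
import random

def make_zone_prediction_example(features, poses, zone_len):
    """features: per-step image embeddings; poses: per-step camera poses."""
    t = len(features)
    start = random.randrange(0, t - zone_len + 1)   # random zone start
    zone = range(start, start + zone_len)
    # Masked steps become None; the model sees poses for every step.
    inputs = [f if i not in zone else None for i, f in enumerate(features)]
    targets = [features[i] for i in zone]   # what the model must predict
    return {"inputs": inputs, "poses": poses, "targets": targets,
            "zone": (start, start + zone_len)}

traj_feats = [[float(i)] * 4 for i in range(10)]        # 10 steps, 4-d feats
traj_poses = [(i * 0.25, 0.0, 0.0) for i in range(10)]  # (x, y, heading)
ex = make_zone_prediction_example(traj_feats, traj_poses, zone_len=3)
```

Conditioning on poses even for masked steps is what distinguishes this from plain image masking: the model knows *where* the agent was, and must infer *what* it saw there.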
-
In reinforcement learning for visual navigation, it is common to develop a model for each new task and train that model from scratch with task-specific interactions in 3D environments. However, this process is expensive; massive amounts of interactions are needed for the model to generalize well. Moreover, this process is repeated whenever there is a change in the task type or the goal modality. We present a unified approach to visual navigation using a novel modular transfer learning model. Our model can effectively leverage its experience from one source task and apply it to multiple target tasks (e.g., ObjectNav, RoomNav, ViewNav) with various goal modalities (e.g., image, sketch, audio, label). Furthermore, our model enables zero-shot experience learning, whereby it can solve the target tasks without receiving any task-specific interactive training. Our experiments on multiple photorealistic datasets and challenging tasks show that our approach learns faster, generalizes better, and outperforms SoTA models by a significant margin.
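One way to picture the modular idea: a shared navigation trunk plus swappable per-modality goal encoders, so that a trunk trained on one source task can be reused for a new goal modality without task-specific training. The module names, embeddings, and decision rule below are hypothetical stand-ins for illustration only, not the paper's architecture.

```python
# Hedged sketch: swappable goal encoders feed a shared, task-agnostic
# trunk.  Zero-shot reuse = pick a different encoder, keep the trunk.

def image_goal_encoder(goal):
    """Hypothetical image-goal encoder mapping into a shared goal space."""
    return [x * 0.5 for x in goal]

def label_goal_encoder(goal):
    """Hypothetical label-goal encoder using a tiny lookup table."""
    table = {"chair": [1.0, 0.0], "door": [0.0, 1.0]}
    return table[goal]

def shared_trunk(observation, goal_embedding):
    """Task-agnostic trunk: fuses observation with the goal embedding
    and emits a navigation action (toy decision rule)."""
    fused = [o + g for o, g in zip(observation, goal_embedding)]
    return "move_forward" if sum(fused) > 0 else "turn_left"

encoders = {"image": image_goal_encoder, "label": label_goal_encoder}
obs = [0.2, -0.1]
action = shared_trunk(obs, encoders["label"]("chair"))  # swap-in encoder
```

The design point is the interface: because every encoder maps into the same goal space, the trunk never needs to know which modality produced the goal.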
-
We introduce the active audio-visual source separation problem, where an agent must move intelligently in order to better isolate the sounds coming from an object of interest in its environment. The agent hears multiple audio sources simultaneously (e.g., a person speaking down the hall in a noisy household) and it must use its eyes and ears to automatically separate out the sounds originating from a target object within a limited time budget. Towards this goal, we introduce a reinforcement learning approach that trains movement policies controlling the agent's camera and microphone placement over time, guided by the improvement in predicted audio separation quality. We demonstrate our approach in scenarios motivated by both augmented reality (system is already co-located with the target object) and mobile robotics (agent begins arbitrarily far from the target object). Using state-of-the-art realistic audio-visual simulations in 3D environments, we demonstrate our model's ability to find minimal movement sequences with maximal payoff for audio source separation.
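The reward signal described above, improvement in predicted separation quality after a move, can be sketched as follows. The quality predictor here is a geometric stand-in (closer to the target scores higher), not the paper's trained estimator; it only serves to show the improvement-based reward shape.

```python
# Hedged sketch: reward a movement by the *change* in predicted
# separation quality, so standing still earns nothing and moving to a
# better viewpoint earns a positive reward.

def predicted_separation_quality(agent_pos, target_pos):
    """Stand-in quality score: higher when the agent is nearer the
    target sound source (a real system would use a learned predictor)."""
    dist = sum((a - t) ** 2 for a, t in zip(agent_pos, target_pos)) ** 0.5
    return 1.0 / (1.0 + dist)

def movement_reward(prev_pos, new_pos, target_pos):
    """Reward = improvement in predicted separation quality."""
    return (predicted_separation_quality(new_pos, target_pos)
            - predicted_separation_quality(prev_pos, target_pos))

r = movement_reward((4.0, 0.0), (2.0, 0.0), (0.0, 0.0))  # moved closer
```

Rewarding the improvement rather than the absolute quality is what pushes the policy toward *minimal* movement sequences: once quality stops improving, further motion earns nothing.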