Free, publicly-accessible full text available December 17, 2023

We propose a neural network approach that yields approximate solutions for high-dimensional optimal control problems and demonstrate its effectiveness using examples from multi-agent path finding. Our approach yields controls in a feedback form, where the policy function is given by a neural network (NN). Specifically, we fuse the Hamilton-Jacobi-Bellman (HJB) and Pontryagin Maximum Principle (PMP) approaches by parameterizing the value function with an NN. Our approach enables us to obtain approximately optimal controls in real-time without having to solve an optimization problem. Once the policy function is trained, generating a control at a given space-time location takes milliseconds; in contrast, efficient nonlinear programming methods typically perform the same task in seconds. We train the NN offline using the objective function of the control problem and penalty terms that enforce the HJB equations. Therefore, our training algorithm does not involve data generated by another algorithm. By training on a distribution of initial states, we ensure the controls' optimality on a large portion of the state-space. Our grid-free approach scales efficiently to dimensions where grids become impractical or infeasible. We apply our approach to several multi-agent collision-avoidance problems in up to 150 dimensions. Furthermore, we empirically observe that the number of parameters in our approach scales linearly with the dimension of the control problem, thereby mitigating the curse of dimensionality.

Deep neural networks (DNNs) have shown their success as high-dimensional function approximators in many applications; however, training DNNs can be challenging in general. DNN training is commonly phrased as a stochastic optimization problem whose challenges include non-convexity, non-smoothness, insufficient regularization, and complicated data distributions. Hence, the performance of DNNs on a given task depends crucially on tuning hyperparameters, especially learning rates and regularization parameters. In the absence of theoretical guidelines or prior experience on similar tasks, this requires solving many training problems, which can be time-consuming and demanding on computational resources. This can limit the applicability of DNNs to problems with non-standard, complex, and scarce datasets, e.g., those arising in many scientific applications. To remedy the challenges of DNN training, we propose slimTrain, a stochastic optimization method for training DNNs with reduced sensitivity to the choice of hyperparameters and fast initial convergence. The central idea of slimTrain is to exploit the separability inherent in many DNN architectures; that is, we separate the DNN into a nonlinear feature extractor followed by a linear model. This separability allows us to leverage recent advances made for solving large-scale, linear, ill-posed inverse problems. Crucially, for the linear weights, slimTrain does not require a learning rate and automatically adapts the regularization parameter. Since our method operates on mini-batches, its computational overhead per iteration is modest. In our numerical experiments, slimTrain outperforms existing DNN training methods with the recommended hyperparameter settings and reduces the sensitivity of DNN training to the remaining hyperparameters.
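
The separability described in this abstract can be illustrated with a small sketch. This is not slimTrain itself: the one-layer `features` function, the fixed regularization parameter `alpha`, and the synthetic data are all assumptions made for illustration, whereas the abstract states that slimTrain adapts the regularization parameter automatically and works on mini-batches. The sketch only shows why the linear weights need no learning rate once the features are fixed: they solve a regularized linear least-squares problem.

```python
import numpy as np

def features(X, K):
    """Hypothetical nonlinear feature extractor: a single tanh layer."""
    return np.tanh(X @ K)

def solve_linear_weights(Z, Y, alpha):
    """Tikhonov-regularized least squares:
    minimize ||Z W - Y||^2 + alpha * ||W||^2 over W (no learning rate needed)."""
    n_feat = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + alpha * np.eye(n_feat), Z.T @ Y)

# Synthetic data generated by a model of exactly this separable form.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))        # inputs
K = rng.standard_normal((3, 8))          # feature-extractor weights (held fixed here)
W_true = rng.standard_normal((8, 2))     # linear weights used to generate targets
Y = features(X, K) @ W_true

alpha = 1e-8                             # assumed fixed value, chosen by hand
Z = features(X, K)
W = solve_linear_weights(Z, Y, alpha)
```

In a full training loop the feature-extractor weights `K` would still be updated by a stochastic optimizer; only the linear block is solved directly.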

Two segmentation methods, one atlas-based and one neural-network-based, were compared to see how well they can each automatically segment the brain stem and cerebellum in Displacement Encoding with Stimulated Echoes Magnetic Resonance Imaging (DENSE-MRI) data. The segmentation is a prerequisite for estimating the average displacements in these regions, which have recently been proposed as biomarkers in the diagnosis of Chiari Malformation type I (CMI). In numerical experiments, the segmentations of both methods were similar to manual segmentations provided by trained experts. It was found that, overall, the neural-network-based method alone produced more accurate segmentations than the atlas-based method did alone, but that a combination of the two methods, in which the atlas-based method is used for the segmentation of the brain stem and the neural network is used for the segmentation of the cerebellum, may be the most successful.

To analyze the abundance of multidimensional data, tensor-based frameworks have been developed. Traditionally, the matrix singular value decomposition (SVD) is used to extract the most dominant features from a matrix containing the vectorized data. While the SVD is highly useful for data that can be appropriately represented as a matrix, this step of vectorization causes us to lose the high-dimensional relationships intrinsic to the data. To facilitate efficient multidimensional feature extraction, we utilize a projection-based classification algorithm using the t-SVDM, a tensor analog of the matrix SVD. Our work extends the t-SVDM framework and the classification algorithm, both initially proposed for tensors of order 3, to any number of dimensions. We then apply this algorithm to a classification task using the StarPlus fMRI dataset. Our numerical experiments demonstrate that there exists a tensor-based approach to fMRI classification that is superior to the best possible equivalent matrix-based approach. Our results illustrate the advantages of our chosen tensor framework, provide insight into beneficial choices of parameters, and could be further developed for classification of more complex imaging data. We provide our Python implementation at https://github.com/elizabethnewman/tensorfmri
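
For orientation, the matrix-based idea that this abstract contrasts with the tensor approach can be sketched as follows. This is an assumed, generic form of projection-based classification with the matrix SVD, not the t-SVDM algorithm and not the code in the authors' repository: the per-class basis, the residual rule, and the function names `class_basis` and `classify` are illustrative choices.

```python
import numpy as np

def class_basis(A, k):
    """Leading k left singular vectors of A, whose columns are vectorized samples."""
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    return U[:, :k]

def classify(x, bases):
    """Assign x to the class whose subspace leaves the smallest projection residual."""
    residuals = [np.linalg.norm(x - U @ (U.T @ x)) for U in bases]
    return int(np.argmin(residuals))

# Toy data in R^6: class 0 lives in span(e1, e2), class 1 in span(e3, e4).
I = np.eye(6)
A0 = np.column_stack([I[:, 0], I[:, 1], I[:, 0] + I[:, 1]])
A1 = np.column_stack([I[:, 2], I[:, 3], I[:, 2] - I[:, 3]])
bases = [class_basis(A0, k=2), class_basis(A1, k=2)]

label_a = classify(I[:, 0] + 0.5 * I[:, 1], bases)
label_b = classify(I[:, 2], bases)
```

Vectorizing each sample into a column is exactly the step the abstract identifies as discarding multidimensional structure, which is what a tensor analog of the SVD is meant to avoid.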

Deep generative models (DGM) are neural networks with many hidden layers trained to approximate complicated, high-dimensional probability distributions using a large number of samples. Once a DGM is trained successfully, we can use it to estimate the likelihood of each observation and to create new samples from the underlying distribution. Developing DGMs has become one of the most hotly researched fields in artificial intelligence in recent years. The literature on DGMs has become vast and is growing rapidly. Some advances have even reached the public sphere, for example, the recent successes in generating realistic-looking images, voices, or movies, so-called deep fakes. Despite these successes, several mathematical and practical issues limit the broader use of DGMs: given a specific dataset, it remains challenging to design and train a DGM and even more challenging to find out why a particular model is or is not effective. To help advance the theoretical understanding of DGMs, we introduce DGMs and provide a concise mathematical framework for modeling the three most popular approaches: normalizing flows (NF), variational autoencoders (VAE), and generative adversarial networks (GAN). We illustrate the advantages and disadvantages of these basic approaches using numerical experiments. Our goal is to enable and motivate the reader to contribute to this proliferating research area. Our presentation also emphasizes relations between generative modeling and optimal transport.

We propose a neural network approach for solving high-dimensional optimal control problems. In particular, we focus on multi-agent control problems with obstacle and collision avoidance. These problems immediately become high-dimensional, even for moderate phase-space dimensions per agent. Our approach fuses the Pontryagin Maximum Principle and Hamilton-Jacobi-Bellman (HJB) approaches and parameterizes the value function with a neural network. Our approach yields controls in a feedback form for quick calculation and robustness to moderate disturbances to the system. We train our model using the objective function and optimality conditions of the control problem. Therefore, our training algorithm neither involves a data generation phase nor solutions from another algorithm. Our model uses empirically effective HJB penalizers for efficient training. By training on a distribution of initial states, we ensure the controls' optimality is achieved on a large portion of the state-space. Our approach is grid-free and scales efficiently to dimensions where grids become impractical or infeasible. We demonstrate our approach's effectiveness on a 150-dimensional multi-agent problem with obstacles.

A normalizing flow is an invertible mapping between an arbitrary probability distribution and a standard normal distribution; it can be used for density estimation and statistical inference. Computing the flow follows the change of variables formula and thus requires invertibility of the mapping and an efficient way to compute the determinant of its Jacobian. To satisfy these requirements, normalizing flows typically consist of carefully chosen components. Continuous normalizing flows (CNFs) are mappings obtained by solving a neural ordinary differential equation (ODE). The neural ODE's dynamics can be chosen almost arbitrarily while ensuring invertibility. Moreover, the log-determinant of the flow's Jacobian can be obtained by integrating the trace of the dynamics' Jacobian along the flow. Our proposed OT-Flow approach tackles two critical computational challenges that limit a more widespread use of CNFs. First, OT-Flow leverages optimal transport (OT) theory to regularize the CNF and enforce straight trajectories that are easier to integrate. Second, OT-Flow features exact trace computation with time complexity equal to trace estimators used in existing CNFs. On five high-dimensional density estimation and generative modeling tasks, OT-Flow performs competitively to state-of-the-art CNFs while on average requiring one-fourth of the number of weights with an 8x speedup in training time and 24x speedup in inference.
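
The change of variables formula mentioned in this abstract can be made concrete with the simplest invertible map. The sketch below assumes an affine flow f(x) = A x + b that sends data to a standard normal; it is not a continuous normalizing flow and not OT-Flow, and the function names are illustrative. It shows the two ingredients the abstract names: the mapping itself and the log-determinant of its Jacobian.

```python
import numpy as np

def standard_normal_logpdf(z):
    """Log-density of N(0, I) evaluated at each row of z."""
    d = z.shape[-1]
    return -0.5 * np.sum(z**2, axis=-1) - 0.5 * d * np.log(2.0 * np.pi)

def affine_flow_logpdf(x, A, b):
    """Log-density of each row of x under the flow f(x) = A x + b.

    Change of variables: log p(x) = log N(f(x); 0, I) + log|det A|,
    since the Jacobian of f is the constant matrix A.
    """
    z = x @ A.T + b
    _, logabsdet = np.linalg.slogdet(A)
    return standard_normal_logpdf(z) + logabsdet

# 1-D check: if x ~ N(mu, sigma^2), then f(x) = (x - mu) / sigma is standard normal.
mu, sigma = 0.5, 2.0
A = np.array([[1.0 / sigma]])
b = np.array([-mu / sigma])
x = np.array([[1.0], [-3.0]])
log_p = affine_flow_logpdf(x, A, b)
```

For a general network-based flow the Jacobian determinant is the expensive part, which is why the abstract emphasizes obtaining the log-determinant by integrating a trace along the flow instead.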