In this work, we develop a new framework for dynamic network flow problems based on optimal transport theory. We show that the dynamic multicommodity minimum-cost network flow problem can be formulated as a multimarginal optimal transport problem, where the cost function and the constraints on the marginals are associated with a graph structure. By exploiting these structures and building on recent advances in optimal transport theory, we develop an efficient method for such entropy-regularized optimal transport problems. In particular, the graph structure is utilized to efficiently compute the projections needed in the corresponding Sinkhorn iterations, and we arrive at a scheme that is both highly computationally efficient and easy to implement. To illustrate the performance of our algorithm, we compare it with a state-of-the-art linear programming (LP) solver. We achieve good approximations to the solution at least one order of magnitude faster than the LP solver. Finally, we showcase the methodology on a traffic routing problem with a large number of commodities. Funding: This work was supported by KTH Digital Futures, Knut och Alice Wallenbergs Stiftelse [Grants KAW 2018.0349, KAW 2021.0274, the Wallenberg AI, Autonomous Systems and Software Program (WASP) funded by the Knut and Alice Wallenberg Foundation], Vetenskapsrådet [Grant 2020-03454], and the National Science Foundation [Grants 1942523 and 2206576]. 
                        more » 
                        « less   
                    
                            
                            Development of a discontinuous Galerkin solver using Legion for heterogeneous high-performance computing architectures
                        
                    
    
            This work discusses the development, verification and performance assessment of a discontinuous Galerkin solver for the compressible Navier-Stokes equations using the Legion programming system. This is motivated by (i) the potential of this family of high-order numer- ical methods to accurately and efficiently realize scale-resolving simulations on unstructured grids and (ii) the desire to accommodate the utilization of emerging compute platforms that exhibit increased parallelism and heterogeneity. As a task-based programming model specifically designed for performance portability across distributed heterogeneous architectures, Legion represents an interesting lternative to the traditional approach of using Message Passing Interface for massively parallel computational physics solvers. Following detailed discussion of the implementation, the high-order convergence of the solver is demonstrated by a suite of canonical test cases and good strong scaling behavior is obtained. This work constitutes a first step towards a research platform that is able to be deployed and efficiently run on modern supercomputers. 
        more » 
        « less   
        
    
                            - Award ID(s):
- 1909379
- PAR ID:
- 10377206
- Date Published:
- Journal Name:
- AIAA Scitech 2021 Forum
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
- 
            
- 
            null (Ed.)Variability in the execution time of computing tasks can cause load imbalance in high-performance computing (HPC) systems. When configuring system- and application-level parameters, engineers traditionally seek configurations that will maximize the mean computational throughput. In an HPC setting, however, high-throughput configurations that do not account for performance variability could result in poor load balancing. In order to determine the effects of performance variance on computationally expensive numerical simulations, the High-Performance LINPACK solver is optimized by using multiobjective optimization to maximize the mean and minimize the standard deviation of the computational throughput on the High-Performance LINPACK benchmark. We show that specific configurations of the solver can be used to control for variability at a small sacrifice in mean throughput. We also identify configurations that result in a relatively high mean throughput, but also result in a high throughput variability.more » « less
- 
            Nonlinear optimal control problems are challenging to solve efficiently due to non-convexity. This paper introduces a trajectory optimization approach that achieves real-time performance by combining machine learning to predict optimal trajectories with refinement by quadratic optimization. First, a library of optimal trajectories is calculated offline and used to train a neural network. Online, the neural network predicts a trajectory for a novel initial state and cost function, and this prediction is further optimized by a sparse quadratic programming solver. We apply this approach to a fly-to-target movement problem for an indoor quadrotor. Experiments demonstrate that the technique calculates near-optimal trajectories in a few milliseconds, and generates agile movement that can be tracked more accurately than existing methods.more » « less
- 
            To guard against machine failures, modern internet services store multiple replicas of the same application data within and across data centers, which introduces the problem of keeping geodistributed replicas consistent with one another in the face of network partitions and unpredictable message latency. To avoid costly and conservative synchronization protocols, many real-world systems provide only weak consistency guarantees (e.g., eventual, causal, or PRAM consistency), which permit certain kinds of disagreement among replicas. There has been much recent interest in language support for specifying and verifying such consistency properties. Although these properties are usually beyond the scope of what traditional type checkers or compiler analyses can guarantee, solver-aided languages are up to the task. Inspired by systems like Liquid Haskell [43] and Rosette [42], we believe that close integration between a language and a solver is the right path to consistent-by-construction distributed applications. Unfortunately, verifying distributed consistency properties requires reasoning about transitive relations (e.g., causality or happens-before), partial orders (e.g., the lattice of replica states under a convergent merge operation), and properties relevant to message processing or API invocation (e.g., commutativity and idempotence) that cannot be easily or efficiently carried out by general-purpose SMT solvers that lack native support for this kind of reasoning. We argue that domain-specific SMT-based tools that exploit the mathematical foundations of distributed consistency would enable both more efficient verification and improved ease of use for domain experts. The principle of exploiting domain knowledge for efficiency and expressivity that has borne fruit elsewhere — such as in the development of high-performance domain-specific languages that trade off generality to gain both performance and productivity — also applies here. Languages augmented with domain-specific, consistency-aware solvers would support the rapid implementation of formally verified programming abstractions that guarantee distributed consistency. In the long run, we aim to democratize the development of such domain-specific solvers by creating a framework for domain-specific solver development that brings new theory solver implementation within the reach of programmers who are not necessarily SMT solver internals experts.more » « less
- 
            The design and analysis of parallel algorithms are both fundamental to the set of high-performance, parallel, and distributed computing skills required to use modern computing resources efficiently. In this work, we present an approach of teaching parallel computing within an undergraduate algorithms course that combines the paradigms of dynamic programming and multithreaded parallelization. We have developed a visualization tool built with the Thread Safe Graphics Library that enables interactive demonstration of parallelization techniques for two fundamental dynamic programming problems, 0/1 Knapsack and Longest Common Subsequence. We describe the implementation of the tool, the real-time animation it produces, and the results of using it in class. The tool is publicly available to be used directly or as a basis on which to build visualizations of other parallel dynamic programming algorithms.more » « less
 An official website of the United States government
An official website of the United States government 
				
			 
					 
					
 
                                    