Abstract Neuromorphic hardware implementations of the Boltzmann Machine (BM) using networks of stochastic neurons can allow non-deterministic polynomial-time (NP) hard combinatorial optimization problems to be solved efficiently. Efficient implementation of such a BM with simulated annealing requires the statistical parameters of the stochastic neurons to be dynamically tunable; however, there has been limited research on stochastic semiconductor devices with controllable statistical distributions. Here, we demonstrate a reconfigurable tin oxide (SnOx)/molybdenum disulfide (MoS2) heterogeneous memristive device that realizes tunable stochastic dynamics in its output sampling characteristics. The device can sample exponential-class sigmoidal distributions analogous to the Fermi-Dirac distribution of physical systems, with a quantitatively defined, tunable “temperature” effect. A BM composed of these tunable stochastic neuron devices, which enables simulated annealing with designed “cooling” strategies, is used to solve MAX-SAT, a representative NP-hard combinatorial optimization problem. Quantitative insights into the effect of different “cooling” strategies on improving the efficiency of the BM optimization process are also provided.
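As a software-level illustration of the sampling and annealing behavior described above, the minimal Python sketch below uses stochastic "neurons" whose on-probability follows a sigmoidal, Fermi-Dirac-like law with an annealed temperature to approximate a small MAX-SAT instance; the clause list, cooling schedule, and sweep count are illustrative assumptions, not the device-level implementation reported in the work.

```python
# Minimal sketch: software stochastic "neurons" with a sigmoidal (Fermi-Dirac-like)
# conditional probability and a geometric cooling schedule, applied to a tiny,
# illustrative MAX-SAT instance (not the reported device-level implementation).
import math
import random

clauses = [(1, -2, 3), (-1, 2), (2, -3), (-2, 3, 1)]  # positive = x_i, negative = NOT x_i
n_vars = 3

def unsatisfied(assign):
    """Energy = number of clauses left unsatisfied by the 0/1 assignment."""
    count = 0
    for clause in clauses:
        sat = any((assign[abs(l) - 1] == 1) if l > 0 else (assign[abs(l) - 1] == 0)
                  for l in clause)
        count += 0 if sat else 1
    return count

def gibbs_step(assign, i, temperature):
    """Sample variable i from the sigmoidal (Fermi-Dirac-like) conditional."""
    assign[i] = 0
    e0 = unsatisfied(assign)
    assign[i] = 1
    e1 = unsatisfied(assign)
    p_one = 1.0 / (1.0 + math.exp((e1 - e0) / temperature))
    assign[i] = 1 if random.random() < p_one else 0

random.seed(0)
assign = [random.randint(0, 1) for _ in range(n_vars)]
temperature = 2.0                      # initial "temperature"
for sweep in range(200):               # simulated annealing with geometric cooling
    for i in range(n_vars):
        gibbs_step(assign, i, temperature)
    temperature = max(0.05, 0.97 * temperature)

print(assign, "unsatisfied clauses:", unsatisfied(assign))
```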
Topology Optimization With Many Right-Hand Sides Using Mirror Descent Stochastic Approximation—Reduction From Many to a Single Sample
Abstract Practical engineering designs typically involve many load cases. For topology optimization with many deterministic load cases, a large number of linear systems of equations must be solved at each optimization step, leading to an enormous computational cost. To address this challenge, we propose a mirror descent stochastic approximation (MD-SA) framework with various step size strategies to solve topology optimization problems with many load cases. We reformulate the deterministic objective function and gradient into stochastic ones through randomization, derive the MD-SA update, and develop algorithmic strategies. The proposed MD-SA algorithm requires only low accuracy in the stochastic gradient and thus uses only a single sample per optimization step (i.e., the sample size is always one). As a result, we reduce the number of linear systems to solve per step from hundreds to one, which drastically reduces the total computational cost, while maintaining a similar design quality. For example, for one of the design problems, the total number of linear systems to solve and wall clock time are reduced by factors of 223 and 22, respectively.
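As a rough illustration of the single-sample idea, the sketch below applies an entropic mirror descent update with one randomly sampled term of a many-term objective per step; the toy quadratic objective, the `targets` data, the step-size rule, and the volume rescaling are illustrative assumptions standing in for the paper's randomized compliance formulation and finite element solves.

```python
# Minimal sketch of mirror descent stochastic approximation (MD-SA): one randomly
# sampled term per step plus an entropic (exponentiated-gradient) update.
# The toy objective and projection are stand-ins for compliance and the
# volume-constrained topology optimization update.
import numpy as np

rng = np.random.default_rng(0)
n_elem, n_loads = 50, 200
targets = rng.uniform(0.2, 0.8, size=(n_loads, n_elem))   # stand-in per-load data

def sampled_gradient(x, load_index):
    """Stochastic gradient using only one 'load case' (one linear solve per step
    in the real method); here a toy quadratic misfit replaces compliance."""
    return 2.0 * (x - targets[load_index])

x = np.full(n_elem, 0.5)          # initial design variables ("densities")
volume_budget = 0.5 * n_elem      # illustrative resource constraint
for k in range(1, 501):
    i = rng.integers(n_loads)               # the sample size is always one
    g = sampled_gradient(x, i)
    step = 0.5 / np.sqrt(k)                 # diminishing step size strategy
    x = x * np.exp(-step * g)               # entropic mirror descent update
    x = np.clip(x, 1e-3, 1.0)
    x *= volume_budget / x.sum()            # crude rescaling onto the volume budget
    x = np.clip(x, 1e-3, 1.0)

print("mean density:", x.mean())
```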
- Award ID(s): 1720305
- PAR ID: 10188504
- Date Published:
- Journal Name: Journal of Applied Mechanics
- Volume: 87
- Issue: 5
- ISSN: 0021-8936
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Topology optimization problems are typically non-convex, and as such, multiple local minima exist. Depending on the initial design, the type of optimization algorithm, and the optimization parameters, gradient-based optimizers converge to one of those minima. Unfortunately, these minima can be highly suboptimal, particularly when the structural response is very non-linear or when multiple constraints are present. This issue is more pronounced in the topology optimization of geometric primitives, because the design representation is more compact and restricted than in free-form topology optimization. In this paper, we investigate the use of tunneling in topology optimization to move from a poor local minimum to a better one. The tunneling method used in this work is a gradient-based deterministic method that finds a better minimum than the previous one in a sequential manner. We demonstrate this approach via numerical examples and show that coupling the tunneling method with topology optimization leads to better designs.
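As a rough illustration of the tunneling idea, the sketch below alternates minimization and tunneling phases on a 1-D double-well toy problem using the classical Levy-Montalvo tunneling function T(x) = (f(x) - f*) / |x - x*|^(2λ); the toy objective, the exponent, the step sizes, and the perturbation are assumptions and do not reproduce the paper's coupling of tunneling with topology optimization of geometric primitives.

```python
# Minimal sketch of minimization + tunneling phases on a 1-D double-well toy
# problem, using the classical Levy-Montalvo tunneling function
#   T(x) = (f(x) - f_star) / |x - x_star|**(2*lam).
# All constants are illustrative assumptions, not the paper's formulation.
import numpy as np

def f(x):        # double well with a poor local minimum and a better one
    return (x**2 - 1.0)**2 + 0.3 * x

def grad_f(x):
    return 4.0 * x * (x**2 - 1.0) + 0.3

def descend(x, grad, steps=1000, lr=0.01):
    for _ in range(steps):
        x -= lr * grad(x)
    return x

lam, eps = 1.0, 1e-12
x_star = descend(1.5, grad_f)              # minimization phase -> poor local minimum
f_star = f(x_star)

def grad_T(x):                             # gradient of the tunneling function
    d = x - x_star
    return (grad_f(x) / (abs(d)**(2 * lam) + eps)
            - 2 * lam * (f(x) - f_star) * np.sign(d) / (abs(d)**(2 * lam + 1) + eps))

x = x_star - 0.1                           # tunneling phase from a small perturbation
for _ in range(2000):
    x -= 0.02 * grad_T(x)
    if f(x) < f_star:                      # tunneled below the current minimum
        break

x_new = descend(x, grad_f)                 # second minimization phase
print(f"poor minimum f={f_star:.3f} at x={x_star:.2f}; better minimum f={f(x_new):.3f} at x={x_new:.2f}")
```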
- Bach, Francis; Blei, David; Scholkopf, Bernhard (Eds.) This paper investigates the asymptotic behavior of gradient descent algorithms, in particular accelerated gradient descent and stochastic gradient descent, in the context of stochastic optimization problems arising in statistics and machine learning, where objective functions are estimated from available data. We show that these algorithms can be computationally modeled by continuous-time ordinary or stochastic differential equations. We establish gradient flow central limit theorems that describe the limiting dynamic behavior of these computational algorithms and the large-sample performance of the related statistical procedures as the number of algorithm iterations and the data size both go to infinity; the limiting dynamics are governed by linear ordinary or stochastic differential equations, such as time-dependent Ornstein-Uhlenbeck processes. Our study provides a unified framework for joint computational and statistical asymptotic analysis: the computational analysis studies the dynamic behavior of the algorithms with time (the number of iterations), while the statistical analysis investigates the large-sample behavior of the statistical procedures (such as estimators and classifiers) computed by applying the algorithms; indeed, the statistical procedures are the limits of the random sequences generated by these iterative algorithms as the number of iterations goes to infinity. The joint analysis based on the obtained gradient flow central limit theorems identifies four factors, namely learning rate, batch size, gradient covariance, and Hessian, from which new theories are derived regarding the local minima found by stochastic gradient descent for solving non-convex optimization problems.
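A toy illustration of the Ornstein-Uhlenbeck view of constant-step stochastic gradient descent on a 1-D quadratic is sketched below; the curvature, noise level, learning rate, and the small-step stationary-variance heuristic used for comparison are illustrative assumptions, not the paper's gradient flow central limit theorems.

```python
# Toy illustration: constant-step SGD on a 1-D quadratic fluctuates around the
# minimum like an Ornstein-Uhlenbeck process, with stationary variance roughly
# eta*sigma**2/(2*h) for small learning rates (a standard small-step heuristic).
import numpy as np

rng = np.random.default_rng(1)
h, sigma, eta = 2.0, 1.0, 0.01        # curvature, gradient-noise std, learning rate

theta = 1.0
samples = []
for k in range(200_000):
    noisy_grad = h * theta + sigma * rng.standard_normal()
    theta -= eta * noisy_grad          # stochastic gradient descent step
    if k > 50_000:                     # discard the transient ("burn-in")
        samples.append(theta)

empirical_var = np.var(samples)
ou_prediction = eta * sigma**2 / (2 * h)   # OU stationary-variance heuristic
print(f"empirical variance {empirical_var:.2e} vs OU prediction {ou_prediction:.2e}")
```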
- QNSTOP consists of serial and parallel (OpenMP) Fortran 2003 codes for the quasi-Newton stochastic optimization method of Castle and Trosset for stochastic search problems. A complete description of QNSTOP, for both local search with a stochastic objective and global search with a “noisy” deterministic objective, is given here, to the best of our knowledge, for the first time. For stochastic search problems, some convergence theory exists for particular algorithmic choices and parameter values. Both the parallel driver subroutine, which offers several parallel decomposition strategies, and the serial driver subroutine can be used for local stochastic search or global deterministic search, based on an input switch. Some performance data for computational systems biology problems are given.
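QNSTOP itself is distributed as Fortran 2003 code; the Python sketch below is only a generic stand-in for one quasi-Newton-style stochastic-search step (sample the noisy objective around the current point, fit a quadratic model by least squares, and take a damped, regularized Newton step) and is not the Castle and Trosset algorithm.

```python
# Generic quasi-Newton-style stochastic-search step: sample a noisy objective
# locally, fit a quadratic response surface by least squares, and take a damped
# Newton step on the fitted model. An illustrative stand-in, not QNSTOP itself.
import numpy as np

rng = np.random.default_rng(2)

def noisy_objective(x):
    """Stochastic objective: a quadratic bowl plus observation noise."""
    return np.sum((x - 1.0)**2) + 0.1 * rng.standard_normal()

def quadratic_model_step(x, radius=0.5, n_samples=60, damping=0.5):
    """Fit f(x+d) ~ c + g.d + 0.5*d'Hd from samples; move to the damped minimizer."""
    dim = x.size
    D = rng.uniform(-radius, radius, size=(n_samples, dim))        # sampled offsets
    y = np.array([noisy_objective(x + d) for d in D])
    iu = np.triu_indices(dim)                                      # model terms d_i*d_j, i <= j
    A = np.hstack([np.ones((n_samples, 1)), D, D[:, iu[0]] * D[:, iu[1]]])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    g = coef[1:1 + dim]                                            # fitted gradient
    H = np.zeros((dim, dim))
    H[iu] = coef[1 + dim:]
    H = H + H.T                                                    # recover symmetric Hessian
    step = -np.linalg.solve(H + 1e-3 * np.eye(dim), g)             # regularized Newton step
    return x + damping * np.clip(step, -radius, radius)            # stay near the samples

x = np.zeros(3)
for _ in range(30):
    x = quadratic_model_step(x)
print("estimated minimizer:", np.round(x, 2))
```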