Title: Learning Complexity of Simulated Annealing
Simulated annealing is an effective and general means of optimization. It is inspired by metallurgy, where the temperature of a material determines its thermodynamic behavior. Likewise, in simulated annealing, the actions the algorithm takes depend entirely on the value of a variable that captures the notion of temperature. Typically, simulated annealing starts at a high temperature, which makes the algorithm's behavior largely unpredictable, and gradually cools the temperature down to become more stable. A key component that plays a crucial role in the performance of simulated annealing is the criterion under which the temperature changes, namely the cooling schedule. Motivated by this, we study the following question in this work: given enough samples from the instances of a specific class of optimization problems, can we design optimal (or approximately optimal) cooling schedules that minimize the runtime or maximize the success rate of the algorithm on average, when the underlying problem is drawn uniformly at random from the same class? We provide positive results both in terms of sample complexity and simulation complexity. For sample complexity, we show that O(m^{1/2}) samples suffice to find an approximately optimal cooling schedule of length m. We complement this result with a lower bound of Ω(m^{1/3}) on the sample complexity of any learning algorithm that provides an almost optimal cooling schedule. These results are general and rely on no assumptions. For simulation complexity, however, we make additional assumptions in order to measure the success rate of an algorithm. To this end, we introduce the monotone stationary graph, which models the performance of simulated annealing. Based on this model, we present polynomial-time algorithms with provable guarantees for the learning problem.
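For reference, the cooling schedule the paper treats as the object to be learned is simply the sequence of temperatures the algorithm visits. A minimal sketch of classical simulated annealing with an explicit, fixed-length schedule might look like the following; the function names and the geometric schedule are illustrative conventions, not the paper's construction.

```python
import math
import random

def simulated_annealing(cost, neighbor, x0, schedule):
    """Minimize `cost` starting from `x0`, taking one step per temperature in `schedule`."""
    x, fx = x0, cost(x0)
    best, fbest = x, fx
    for T in schedule:
        y = neighbor(x)
        fy = cost(y)
        # Always accept improvements; accept uphill moves with Boltzmann probability exp(-ΔE/T).
        if fy <= fx or random.random() < math.exp(-(fy - fx) / T):
            x, fx = y, fy
            if fx < fbest:
                best, fbest = x, fx
    return best, fbest

# A geometric cooling schedule of length m -- one common parametric family of schedules.
m, T0, alpha = 1000, 10.0, 0.995
schedule = [T0 * alpha ** k for k in range(m)]

# Toy usage: minimize a one-dimensional quadratic with Gaussian proposal moves.
best, fbest = simulated_annealing(
    cost=lambda x: (x - 3.0) ** 2,
    neighbor=lambda x: x + random.gauss(0.0, 1.0),
    x0=0.0,
    schedule=schedule,
)
```

In this sketch, the learning problem studied in the paper corresponds to choosing the entries of `schedule` (here a hypothetical geometric family) so that the algorithm performs well on average over instances drawn from the problem class.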
Award ID(s): 1815011, 1733556
PAR ID: 10283184
Author(s) / Creator(s): ; ;
Date Published:
Journal Name: Proceedings of the International Workshop on Artificial Intelligence and Statistics
Volume: 24
ISSN: 1525-531X
Page Range / eLocation ID: 1540-1548
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Markov chain Monte Carlo algorithms have important applications in counting problems and in machine learning problems, settings that involve estimating quantities that are difficult to compute exactly. How much can quantum computers speed up classical Markov chain algorithms? In this work we consider the problem of speeding up simulated annealing algorithms, where the stationary distributions of the Markov chains are Gibbs distributions at temperatures specified according to an annealing schedule. We construct a quantum algorithm that both adaptively constructs an annealing schedule and quantum samples at each temperature. Our adaptive annealing schedule roughly matches the length of the best classical adaptive annealing schedules and improves on nonadaptive temperature schedules by roughly a quadratic factor. Our dependence on the Markov chain gap matches other quantum algorithms and is quadratically better than what classical Markov chains achieve. Our algorithm is the first to combine both of these quadratic improvements. Like other quantum walk algorithms, it also improves on classical algorithms by producing “qsamples” instead of classical samples. This means preparing quantum states whose amplitudes are the square roots of the target probability distribution. In constructing the annealing schedule we make use of amplitude estimation, and we introduce a method for making amplitude estimation nondestructive at almost no additional cost, a result that may have independent interest. Finally we demonstrate how this quantum simulated annealing algorithm can be applied to the problems of estimating partition functions and Bayesian inference. 
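The "qsample" mentioned above is the coherent encoding of the target distribution π described in the abstract: a state whose amplitudes are the square roots of the target probabilities. Schematically,
$$
|\pi\rangle \;=\; \sum_{x} \sqrt{\pi(x)}\, |x\rangle ,
$$
so a computational-basis measurement returns x with probability π(x), while the state itself can feed amplitude-based subroutines such as the amplitude estimation used in constructing the schedule.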
  2. Quantum annealing (QA) is a continuous-time heuristic quantum algorithm for solving or approximately solving classical optimization problems. The algorithm uses a schedule to interpolate between a driver Hamiltonian with an easy-to-prepare ground state and a problem Hamiltonian whose ground state encodes solutions to an optimization problem. The standard implementation relies on the evolution being adiabatic: keeping the system in the instantaneous ground state with high probability and requiring a time scale inversely related to the minimum energy gap between the instantaneous ground and excited states. However, adiabatic evolution can lead to evolution times that scale exponentially with the system size, even for computationally simple problems. Here, we study whether non-adiabatic evolutions with optimized annealing schedules can bypass this exponential slowdown for one such class of problems, the frustrated ring model. For sufficiently optimized annealing schedules and system sizes of up to 39 qubits, we provide numerical evidence that we can avoid the exponential slowdown. Our work highlights the potential of highly controllable QA to circumvent bottlenecks associated with the standard implementation of QA.
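The interpolation described above is usually written as a path through a schedule-dependent Hamiltonian; a standard linear-interpolation form, stated here as an illustrative convention rather than this paper's exact parameterization, is
$$
H(s) \;=\; (1 - s)\, H_{\mathrm{driver}} \;+\; s\, H_{\mathrm{problem}}, \qquad s = s(t) \in [0, 1],
$$
where the annealing schedule is the choice of the path s(t) from s(0) = 0 to s(T) = 1, and the adiabatic requirement ties the total evolution time T to the inverse of the minimum spectral gap of H(s) along that path.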
  3. Neuromorphic hardware implementation of the Boltzmann Machine (BM) using a network of stochastic neurons can allow non-deterministic polynomial-time (NP) hard combinatorial optimization problems to be solved efficiently. Efficient implementation of such a BM with simulated annealing requires the statistical parameters of the stochastic neurons to be dynamically tunable; however, there has been limited research on stochastic semiconductor devices with controllable statistical distributions. Here, we demonstrate a reconfigurable tin oxide (SnOx)/molybdenum disulfide (MoS2) heterogeneous memristive device that can realize tunable stochastic dynamics in its output sampling characteristics. The device can sample exponential-class sigmoidal distributions analogous to the Fermi-Dirac distribution of physical systems, with a quantitatively defined, tunable "temperature" effect. A BM composed of these tunable stochastic neuron devices, which enables simulated annealing with designed "cooling" strategies, is used to solve MAX-SAT, a representative NP-hard combinatorial optimization problem. Quantitative insights into the effect of different "cooling" strategies on improving the efficiency of the BM optimization process are also provided.
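As a software analogue of the tunable "temperature" effect (not a model of the device physics; the names are illustrative), a Boltzmann-machine neuron update with a cooling sweep might look like this:

```python
import math
import random

def neuron_sample(delta_E, T):
    """Sample a binary neuron state.

    delta_E is the energy increase caused by turning the neuron on;
    P(on) = 1 / (1 + exp(delta_E / T)) is the sigmoidal (Fermi-Dirac-like)
    acceptance curve, and the temperature T sets how sharp it is.
    """
    p_on = 1.0 / (1.0 + math.exp(delta_E / T))
    return 1 if random.random() < p_on else 0

# A "cooling" strategy: sample while lowering T, as in simulated annealing.
for T in (4.0, 2.0, 1.0, 0.5, 0.25):
    state = neuron_sample(delta_E=random.uniform(-1.0, 1.0), T=T)
```

At high T the neuron is nearly a coin flip; as T decreases the update becomes nearly deterministic, which is the behavior the device realizes in hardware.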
  4. Belkin, M.; Kpotufe, S. (Eds.)
    We study the problem of robust learning under clean-label data-poisoning attacks, where the attacker injects (an arbitrary set of) correctly labeled examples into the training set to fool the algorithm into making mistakes on specific test instances at test time. The learning goal is to minimize the attackable rate (the probability mass of attackable test instances), which is more difficult than optimal PAC learning. As we show, any robust algorithm with diminishing attackable rate can achieve the optimal dependence on ε in its PAC sample complexity, i.e., O(1/ε). On the other hand, the attackable rate might be large even for some optimal PAC learners, e.g., SVM for linear classifiers. Furthermore, we show that the class of linear hypotheses is not robustly learnable when the data distribution has zero margin, and is robustly learnable in the case of positive margin but requires sample complexity exponential in the dimension. For a general hypothesis class with bounded VC dimension, if the attacker is limited to adding at most t > 0 poison examples, the optimal robust learning sample complexity grows almost linearly with t.
  5. We study the convergence rate of a continuous-time simulated annealing process $(X_t;\, t \ge 0)$ for approximating the global optimum of a given function $f$. We prove that the tail probability $\mathbb{P}(f(X_t) > \min f + \delta)$ decays polynomially in time with an appropriately chosen cooling schedule of temperature, and we provide an explicit convergence rate through a non-asymptotic bound. Our argument applies recent developments of the Eyring-Kramers law on functional inequalities for the Gibbs measure at low temperatures.
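In the standard formulation of such a continuous-time annealing process (written here as an assumption about notation, since the abstract does not spell it out), $X_t$ follows an overdamped Langevin diffusion whose temperature is lowered over time:
$$
dX_t \;=\; -\nabla f(X_t)\, dt \;+\; \sqrt{2\, \epsilon_t}\, dW_t ,
$$
where $\epsilon_t$ is the cooling schedule of temperature and $W_t$ is a Brownian motion. At a fixed temperature $\epsilon$ the dynamics targets the Gibbs measure proportional to $\exp(-f(x)/\epsilon)$, which is where Eyring-Kramers-type functional inequalities for the Gibbs measure at low temperature enter the analysis.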