Many machine learning problems optimize an objective that must be measured with noise. The primary method is first-order stochastic gradient descent using one or more Monte Carlo (MC) samples at each step. There are settings where ill-conditioning makes second-order methods such as limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) more effective. We study the use of randomized quasi-Monte Carlo (RQMC) sampling for such problems. When MC sampling has a root mean squared error (RMSE) of O(n^{-1/2}), RQMC has an RMSE of o(n^{-1/2}) that can be close to O(n^{-3/2}) in favorable settings. We prove that improved sampling accuracy translates directly to improved optimization. In our empirical investigations for variational Bayes, using RQMC with a stochastic quasi-Newton method greatly speeds up the optimization, and sometimes finds a better parameter value than MC does.
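As an illustrative sketch (ours, not code from the paper), the gap between the two error rates is easy to see on a one-dimensional test expectation: plain MC has RMSE O(n^{-1/2}), while scrambled Sobol' points (RQMC) mapped through the normal inverse CDF land much closer to the true value. The integrand and sample size are arbitrary choices for the demonstration.

```python
import numpy as np
from scipy.stats import qmc, norm

# Smooth test expectation E[exp(Z)], Z ~ N(0,1), with known value exp(1/2).
exact = np.exp(0.5)
n = 2**10

# Plain MC estimate: RMSE ~ O(n^(-1/2)).
rng = np.random.default_rng(0)
mc_est = np.exp(rng.standard_normal(n)).mean()

# RQMC estimate: scrambled Sobol' points pushed through the normal
# inverse CDF; RMSE can approach O(n^(-3/2)) for smooth integrands.
u = qmc.Sobol(d=1, scramble=True, seed=0).random(n).ravel()
rqmc_est = np.exp(norm.ppf(u)).mean()
```

At the same sample size the RQMC error is typically orders of magnitude smaller, which is the effect the abstract exploits inside a stochastic quasi-Newton loop.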
On dropping the first Sobol' point
Quasi-Monte Carlo (QMC) points are a substitute for plain Monte Carlo (MC) points that greatly improve integration accuracy under mild assumptions on the problem. Because QMC can give errors that are o(1/n) as n → ∞, and randomized versions can attain root mean squared errors that are o(1/n), changing even one point can change the estimate by an amount much larger than the error would have been and worsen the convergence rate. As a result, certain practices that fit quite naturally and intuitively with MC points can be very detrimental to QMC performance. These include thinning, burn-in, and taking sample sizes such as powers of 10, when the QMC points were designed for different sample sizes. This article looks at the effects of a common practice in which one skips the first point of a Sobol' sequence. The retained points ordinarily fail to be a digital net, and when scrambling is applied, skipping over the first point can increase the numerical error by a factor proportional to √n, where n is the number of function evaluations used.
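A small sketch (ours, not the article's) of why skipping the first point hurts: the first 2^m Sobol' points are balanced, with exactly one point in each dyadic interval, and dropping the leading zero point destroys that balance. The 1-D case and n = 8 are arbitrary choices for illustration.

```python
import numpy as np
from scipy.stats import qmc

# First 9 points of the unscrambled 1-D Sobol' sequence (the first is 0).
pts = qmc.Sobol(d=1, scramble=False).random(9).ravel()

# The first 8 points form a digital net: exactly one point falls in each
# dyadic interval [k/8, (k+1)/8).
net = np.sort(np.floor(pts[:8] * 8).astype(int))

# Skipping the first point and taking the next 8 breaks the balance:
# some intervals now hold two points and one holds none.
skipped = np.floor(pts[1:9] * 8).astype(int)
```

Under scrambling the same imbalance persists in distribution, which is where the √n error inflation described above comes from.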
- Award ID(s): 1837931
- PAR ID: 10346375
- Editor(s): Alex Keller
- Date Published:
- Journal Name: 978-3-030-43465-6
- Page Range / eLocation ID: 71-86
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- 
Summary Sequential Monte Carlo algorithms are widely accepted as powerful computational tools for making inference with dynamical systems. A key step in sequential Monte Carlo is resampling, which plays the role of steering the algorithm towards the future dynamics. Several strategies have been used in practice, including multinomial resampling, residual resampling, optimal resampling, stratified resampling and optimal transport resampling. In one-dimensional cases, we show that optimal transport resampling is equivalent to stratified resampling on the sorted particles, and both strategies minimize the resampling variance as well as the expected squared energy distance between the original and resampled empirical distributions. For general $$d$$-dimensional cases, we show that if the particles are first sorted using the Hilbert curve, the variance of stratified resampling is $$O(m^{-(1+2/d)})$$, an improvement over the best previously known rate of $$O(m^{-(1+1/d)})$$, where $$m$$ is the number of resampled particles. We show that this improved rate is optimal for ordered stratified resampling schemes, as conjectured in Gerber et al. (2019). We also present an almost-sure bound on the Wasserstein distance between the original and Hilbert-curve-resampled empirical distributions. In light of these results, we show that for dimension $$d>1$$ the mean square error of sequential quasi-Monte Carlo with $$n$$ particles can be $$O(n^{-1-4/\{d(d+4)\}})$$ if Hilbert curve resampling is used and a specific low-discrepancy set is chosen. To our knowledge, this is the first known convergence rate lower than $$o(n^{-1})$$.
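As a hedged illustration of the resampling step discussed above (the function name `stratified_resample` is ours), stratified resampling draws one uniform per stratum [i/m, (i+1)/m) and inverts the weight CDF; sorting the particles first (by value in 1-D, by Hilbert curve in higher dimensions) is a separate preprocessing step not shown here.

```python
import numpy as np

def stratified_resample(weights, rng):
    """Stratified resampling: one uniform draw per stratum [i/m, (i+1)/m),
    inverted through the CDF of the normalized weights."""
    w = np.asarray(weights, dtype=float)
    m = len(w)
    positions = (np.arange(m) + rng.random(m)) / m
    return np.searchsorted(np.cumsum(w), positions)

# With equal weights every stratum contains exactly one particle,
# so each particle is kept exactly once.
idx = stratified_resample(np.full(8, 1 / 8), np.random.default_rng(0))
```

Confining each draw to its own stratum is what removes the extra variance that multinomial resampling incurs.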
- 
Summary Maximum simulated likelihood estimation of mixed multinomial logit models requires evaluation of a multidimensional integral. Quasi-Monte Carlo (QMC) methods such as Halton sequences and modified Latin hypercube sampling are workhorse methods for integral approximation. Earlier studies explored the potential of sparse grid quadrature (SGQ), but SGQ suffers from negative weights. As an alternative to QMC and SGQ, we looked into the recently developed designed quadrature (DQ) method. DQ requires fewer nodes to get the same level of accuracy as QMC and SGQ, is as easy to implement, ensures positivity of weights, and can be created on any general polynomial space. We benchmarked DQ against QMC in a Monte Carlo and an empirical study. DQ outperformed QMC in all considered scenarios, is practice ready, and has potential to become the workhorse method for integral approximation.
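To make the integration task concrete (this is our sketch, not the paper's benchmark), here is a scrambled-Halton approximation of a simple two-dimensional Gaussian expectation, the kind of integral that appears inside a mixed-logit likelihood; DQ would replace the Halton nodes with a designed node/weight set.

```python
import numpy as np
from scipy.stats import qmc, norm

# Approximate E[exp(Z1 + Z2)] for independent standard normals Z1, Z2;
# the exact value is exp(1/2) * exp(1/2) = e.
n = 1024
u = qmc.Halton(d=2, scramble=True, seed=0).random(n)
est = np.exp(norm.ppf(u).sum(axis=1)).mean()
exact = np.e
```

The same inverse-CDF mapping extends to the correlated random coefficients of an actual mixed logit model.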
- 
Polynomial approximations for e^{-x} and e^{x} have applications to the design of algorithms for many problems, and our degree bounds show both the power and limitations of these algorithms. We focus in particular on the Batch Gaussian Kernel Density Estimation problem for n sample points in Θ(log n) dimensions with error δ = n^{-Θ(1)}. We show that the running time one can achieve depends on the square of the diameter of the point set, B, with a transition at B = Θ(log n) mirroring the corresponding transition in d_{B;δ}(e^{-x}):
  - When B = o(log n), we give the first algorithm running in time n^{1+o(1)}.
  - When B = κ log n for a small constant κ > 0, we give an algorithm running in time n^{1+O(log log κ^{-1} / log κ^{-1})}. The log log κ^{-1} / log κ^{-1} term in the exponent comes from analyzing the behavior of the leading constant in our computation of d_{B;δ}(e^{-x}).
  - When B = ω(log n), we show that time n^{2-o(1)} is necessary assuming SETH.
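The quantity d_{B;δ}(e^{-x}) above is the least polynomial degree achieving uniform error δ on [0, B]. As a rough numerical companion (ours, not from the paper), a Chebyshev interpolant of e^{-x} witnesses an upper bound on that degree; here degree 10 already reaches well below 10^{-6} uniform error on [0, 4].

```python
import numpy as np
from numpy.polynomial.chebyshev import Chebyshev

# Chebyshev interpolant of exp(-x) on [0, B]; its uniform error shows
# that d_{B;delta}(e^{-x}) <= 10 for any delta above that error level.
B = 4.0
p = Chebyshev.interpolate(lambda x: np.exp(-x), deg=10, domain=[0, B])
grid = np.linspace(0, B, 1001)
max_err = np.max(np.abs(p(grid) - np.exp(-grid)))
```

As B grows, the degree needed for fixed δ grows too, which is the transition the running-time bounds above track.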
- 
Li, J.; Spanos, P. D.; Chen, J. B.; Peng, Y. B. (Eds.) Infrastructure networks offer critical services to modern society. They dynamically interact with the environment, operators, and users. Infrastructure networks are unique engineered systems, large in scale and high in complexity. One fundamental issue for their reliability assessment is the uncertainty propagation from stochastic disturbances across interconnected components. Monte Carlo simulation (MCS) remains an approachable way to quantify stochastic dynamics from components to systems. Its application depends on time efficiency along with the capability of delivering reliable approximations. In this paper, we introduce quasi-Monte Carlo (QMC) sampling techniques to improve modeling efficiency. Also, we suggest a principled Monte Carlo (PMC) method that equips the crude MCS with Probably Approximately Correct (PAC) approaches to deliver guaranteed approximations. We compare our proposed schemes with a competitive approach for stochastic dynamic analysis, namely the Probability Density Evolution Method (PDEM). Our computational experiments are on ideal but complex enough source-terminal (S-T) dynamic network reliability problems. We endow network links with oscillators so that they can jump across safe and failed states, allowing us to treat the problem from a stochastic process perspective. We find that QMC alone can yield practical accuracy, and PMC with a PAC algorithm can deliver accuracy guarantees. Also, QMC is more versatile and efficient than the PDEM for network reliability assessment. The QMC and PMC methods provide advanced uncertainty propagation techniques to support decision makers with their reliability problems.
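A minimal sketch of the S-T reliability estimation idea (our toy example; the paper's networks and oscillator dynamics are far richer): two independent parallel links, each available with probability r, so the exact reliability is 1 - (1 - r)^2, estimated here by crude MCS and by scrambled-Sobol' QMC.

```python
import numpy as np
from scipy.stats import qmc

r = 0.9                          # per-link availability (assumed value)
exact = 1 - (1 - r)**2           # S-T reliability of two parallel links
n = 2**12

u_mc = np.random.default_rng(0).random((n, 2))
u_qmc = qmc.Sobol(d=2, scramble=True, seed=0).random(n)

# The system is connected if at least one link is up.
connected = lambda u: ((u[:, 0] < r) | (u[:, 1] < r)).mean()
mc_est, qmc_est = connected(u_mc), connected(u_qmc)
```

For a real network the indicator would be replaced by an S-T connectivity check over sampled link states, but the sampling-side substitution of QMC for MC is the same.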