The classical static Schrödinger Bridge (SSB) problem, which seeks the most likely stochastic evolution between two marginal probability measures, has been studied extensively in the optimal transport and statistical physics communities, and more recently in the machine learning community amid the surge of generative models. The standard approach to solving SSB is to first identify its Kantorovich dual and then use Sinkhorn's algorithm to find the optimal potential functions. While the original SSB is only a strictly convex minimization problem, this approach is known to guarantee linear convergence under mild assumptions. In this work, we consider a generalized SSB allowing any strictly increasing divergence functional, far generalizing the entropy functional x log(x) in the standard SSB. This problem naturally arises in a wide range of seemingly unrelated problems in entropic optimal transport, random graphs/matrices, and combinatorics. We establish Kantorovich duality and linear convergence of Sinkhorn's algorithm for the generalized SSB problem under mild conditions. Our results provide a new rigorous foundation for understanding Sinkhorn-type iterative methods in the context of large-scale generalized Schrödinger bridges.
                    This content will become publicly available on July 30, 2026
                            
                            On Sinkhorn's Algorithm and Choice Modeling
                        
                    
    
            For a broad class of models widely used in practice for choice and ranking data based on Luce's choice axiom, including the Bradley–Terry–Luce and Plackett–Luce models, we show that the associated maximum likelihood estimation problems are equivalent to a classic matrix balancing problem with target row and column sums. This perspective opens doors between two seemingly unrelated research areas, and allows us to unify existing algorithms in the choice modeling literature as special instances or analogs of Sinkhorn's celebrated algorithm for matrix balancing. We draw inspiration from these connections and resolve some open problems in the study of Sinkhorn's algorithm. We establish the global linear convergence of Sinkhorn's algorithm for non-negative matrices whenever finite scaling matrices exist, and characterize its linear convergence rate in terms of the algebraic connectivity of a weighted bipartite graph. We further derive the sharp asymptotic rate of linear convergence, which generalizes a classic result of Knight (2008). To our knowledge, these are the first quantitative linear convergence results for Sinkhorn's algorithm for general non-negative matrices and positive marginals. Our results highlight the importance of connectivity and orthogonality structures in matrix balancing and Sinkhorn's algorithm, which could be of independent interest. More broadly, the connections we establish in this paper between matrix balancing and choice modeling could also help motivate further transmission of ideas and lead to interesting results in both disciplines.
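To make the matrix balancing problem named in the abstract concrete, the following is a minimal sketch of the textbook Sinkhorn iteration, not the paper's own code. The function name `sinkhorn_balance`, its parameters, and the stopping rule are illustrative assumptions. The sketch also assumes a strictly positive input matrix and target row and column sums with equal totals, which is a stronger condition than the general non-negative setting the abstract covers.

```python
def sinkhorn_balance(A, row_targets, col_targets, max_iter=1000, tol=1e-10):
    """Rescale rows and columns of A to approach the target sums.

    Hypothetical helper for illustration. Assumes every entry of A is
    strictly positive and sum(row_targets) == sum(col_targets).
    Returns the scaled matrix diag(r) * A * diag(c).
    """
    n, m = len(A), len(A[0])
    r = [1.0] * n  # row scaling factors
    c = [1.0] * m  # column scaling factors
    for _ in range(max_iter):
        # Row step: pick r so each scaled row matches its target sum.
        for i in range(n):
            r[i] = row_targets[i] / sum(A[i][j] * c[j] for j in range(m))
        # Column step: pick c so each scaled column matches its target sum.
        for j in range(m):
            c[j] = col_targets[j] / sum(r[i] * A[i][j] for i in range(n))
        # Columns are now exact; stop once the rows are also within tol.
        row_err = max(
            abs(sum(r[i] * A[i][j] * c[j] for j in range(m)) - row_targets[i])
            for i in range(n)
        )
        if row_err < tol:
            break
    return [[r[i] * A[i][j] * c[j] for j in range(m)] for i in range(n)]


# Usage: balance a small positive matrix to unit row and column sums.
balanced = sinkhorn_balance([[1.0, 2.0], [3.0, 4.0]], [1.0, 1.0], [1.0, 1.0])
```

Each pass alternates a row normalization with a column normalization, which is the alternating structure whose convergence rate the abstract analyzes.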
        
    
                            - Award ID(s):
- 2143176
- PAR ID:
- 10621144
- Publisher / Repository:
- INFORMS
- Date Published:
- Journal Name:
- Operations Research
- ISSN:
- 1526-5463
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
- 
            We propose a unified framework to solve general low-rank plus sparse matrix recovery problems based on matrix factorization, which covers a broad family of objective functions satisfying the restricted strong convexity and smoothness conditions. Based on projected gradient descent and the double thresholding operator, our proposed generic algorithm is guaranteed to converge to the unknown low-rank and sparse matrices at a locally linear rate, while matching the best-known robustness guarantee (i.e., tolerance for sparsity). At the core of our theory is a novel structural Lipschitz gradient condition for low-rank plus sparse matrices, which is essential for proving the linear convergence rate of our algorithm, and which we believe is of independent interest for proving fast rates for general superposition-structured models. We illustrate the application of our framework through two concrete examples: robust matrix sensing and robust PCA. Empirical experiments corroborate our theory.
- 
            We propose a new randomized optimization method for high-dimensional problems which can be seen as a generalization of coordinate descent to random subspaces. We show that an adaptive sampling strategy for the random subspace significantly outperforms the oblivious sampling method, which is the common choice in the recent literature. The adaptive subspace can be efficiently generated by a correlated random matrix ensemble whose statistics mimic the input data. We prove that the improvement in the relative error of the solution can be tightly characterized in terms of the spectrum of the data matrix, and provide probabilistic upper bounds. We then illustrate the consequences of our theory with data matrices of different spectral decay. Extensive experimental results show that the proposed approach offers significant speedups in machine learning problems including logistic regression, kernel classification with random convolution layers, and shallow neural networks with rectified linear units. Our analysis is based on convex analysis and Fenchel duality, and establishes connections to sketching and randomized matrix decompositions.
- 
            In this paper, we present the first explicit and non-asymptotic global convergence rates of the BFGS method when implemented with an inexact line search scheme satisfying the Armijo-Wolfe conditions. We show that BFGS achieves a global linear convergence rate of $(1 - \frac{1}{\kappa})^t$ for $\mu$-strongly convex functions with $L$-Lipschitz gradients, where $\kappa = \frac{L}{\mu}$ represents the condition number. Additionally, if the objective function's Hessian is Lipschitz, BFGS with the Armijo-Wolfe line search achieves a linear convergence rate that depends solely on the line search parameters, independent of the condition number. We also establish a global superlinear convergence rate of $\mathcal{O}((\frac{1}{t})^t)$. These global bounds are all valid for any starting point $x_0$ and any symmetric positive definite initial Hessian approximation matrix $B_0$, though the choice of $B_0$ impacts the number of iterations needed to achieve these rates. By synthesizing these results, we outline the first global complexity characterization of BFGS with the Armijo-Wolfe line search. Additionally, we clearly define a mechanism for selecting the step size to satisfy the Armijo-Wolfe conditions and characterize its overall complexity.
- 
            In this article, we propose, analyze and demonstrate a dynamic momentum method to accelerate power and inverse power iterations with minimal computational overhead. The method can be applied to real diagonalizable matrices, is provably convergent with acceleration in the symmetric case, and does not require a priori spectral knowledge. We review and extend background results on previously developed static momentum accelerations for the power iteration through the connection between the momentum accelerated iteration and the standard power iteration applied to an augmented matrix. We show that the augmented matrix is defective for the optimal parameter choice. We then present our dynamic method which updates the momentum parameter at each iteration based on the Rayleigh quotient and two previous residuals. We present convergence and stability theory for the method by considering a power‐like method consisting of multiplying an initial vector by a sequence of augmented matrices. We demonstrate the developed method on a number of benchmark problems, and see that it outperforms both the power iteration and often the static momentum acceleration with optimal parameter choice. Finally, we present and demonstrate an explicit extension of the algorithm to inverse power iterations.
 An official website of the United States government
