While mixed integer linear programming (MILP) solvers are routinely used to solve a wide range of important science and engineering problems, it remains a challenging task for end users to write correct and efficient MILP constraints, especially for problems specified using the inherently non-linear Boolean logic operations. To overcome this challenge, we propose a syntax guided synthesis (SyGuS) method capable of generating high-quality MILP constraints from the specifications expressed using arbitrary combinations of Boolean logic operations. At the center of our method is an extensible domain specification language (DSL) whose expressiveness may be improved by adding new integer variables as decision variables, together with an iterative procedure for synthesizing linear constraints from non-linear Boolean logic operations using these integer variables. To make the synthesis method efficient, we also propose an over-approximation technique for soundly proving the correctness of the synthesized linear constraints, and an under-approximation technique for safely pruning away the incorrect constraints. We have implemented and evaluated the method on a wide range of benchmark specifications from statistics, machine learning, and data science applications. The experimental results show that the method is efficient in handling these benchmarks, and the quality of the synthesized MILP constraints is close to, or higher than, that of manually-written constraints in terms of both compactness and solving time.
more »
« less
Improved estimates for the number of non-negative integer matrices with given row and column sums
The number of non-negative integer matrices with given row and column sums features in a variety of problems in mathematics and statistics but no closed-form expression for it is known, so we rely on approximations. In this paper, we describe a new such approximation, motivated by consideration of the statistics of matrices with non-integer numbers of columns. This estimate can be evaluated in time linear in the size of the matrix and returns results of accuracy as good as or better than existing linear-time approximations across a wide range of settings. We show that the estimate is asymptotically exact in the regime of sparse tables, while empirically performing at least as well as other linear-time estimates in the regime of dense tables. We also use the new estimate as the starting point for an improved numerical method for either counting or sampling matrices with given margins using sequential importance sampling. Code implementing our methods is available.
more »
« less
- Award ID(s):
- 2005899
- PAR ID:
- 10537883
- Publisher / Repository:
- The Royal Society
- Date Published:
- Journal Name:
- Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences
- Volume:
- 480
- Issue:
- 2282
- ISSN:
- 1364-5021
- Page Range / eLocation ID:
- 20230470
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Let I = f1,..., fm ⊂ Q[x1,..., xn] be a zero dimensional radical ideal defined by polynomials given with exact rational coefficients. Assume that we are given approximations {z1,..., zk} ⊂ Cn for the common roots {ξ1,..., ξk} = V (I) ⊆ Cn. In this paper we show how to construct and certify the rational entries of Hermite matrices for I from the approximate roots {z1,..., zk}. When I is non-radical, we give methods to construct and certify Hermite matrices for √ I from the approximate roots. Furthermore, we use signatures of these Hermite matrices to give rational certificates of non-negativity of a given polynomial over a (possibly positive dimensional) real variety, as well as certificates that there is a real root within an ε distance from a given point z ∈ Qn.more » « less
-
Kernel matrices, as well as weighted graphs represented by them, are ubiquitous objects in machine learning, statistics and other related fields. The main drawback of using kernel methods (learning and inference using kernel matrices) is efficiency – given n input points, most kernel-based algorithms need to materialize the full n × n kernel matrix before performing any subsequent computation, thus incurring Ω(n^2) runtime. Breaking this quadratic barrier for various problems has therefore, been a subject of extensive research efforts. We break the quadratic barrier and obtain subquadratic time algorithms for several fundamental linear-algebraic and graph processing primitives, including approximating the top eigenvalue and eigenvector, spectral sparsification, solving lin- ear systems, local clustering, low-rank approximation, arboricity estimation and counting weighted triangles. We build on the recently developed Kernel Density Estimation framework, which (after preprocessing in time subquadratic in n) can return estimates of row/column sums of the kernel matrix. In particular, we de- velop efficient reductions from weighted vertex and weighted edge sampling on kernel graphs, simulating random walks on kernel graphs, and importance sampling on matrices to Kernel Density Estimation and show that we can generate samples from these distributions in sublinear (in the support of the distribution) time. Our reductions are the central ingredient in each of our applications and we believe they may be of independent interest. We empirically demonstrate the efficacy of our algorithms on low-rank approximation (LRA) and spectral sparsi- fication, where we observe a 9x decrease in the number of kernel evaluations over baselines for LRA and a 41x reduction in the graph size for spectral sparsification.more » « less
-
The inherent nonlinearity of the power flow equations poses significant challenges in accurately modeling power systems, particularly when employing linearized approximations. Although power flow linearizations provide computational efficiency, they can fail to fully capture nonlinear behavior across diverse operating conditions. To improve approximation accuracy, we propose conservative piecewise linear approximations (CPLA) of the power flow equations, which are designed to consistently over- or under-estimate the quantity of interest, ensuring conservative behavior in optimization. The flexibility provided by piecewise linear functions can yield improved accuracy relative to standard linear approximations. However, applying CPLA across all dimensions of the power flow equations could introduce significant computational complexity, especially for large-scale optimization problems. In this paper, we propose a strategy that selectively targets dimensions exhibiting significant nonlinearities. Using a second-order sensitivity analysis, we identify the directions where the power flow equations exhibit the most significant curvature and tailor the CPLAs to improve accuracy in these specific directions. This approach reduces the computational burden while maintaining high accuracy, making it particularly well-suited for mixed-integer programming problems involving the power flow equations.more » « less
-
We study the A-optimal design problem where we are given vectors υ1, …, υn ∊ ℝd, an integer k ≥ d, and the goal is to select a set S of k vectors that minimizes the trace of (∑i∊Svivi⊺)−1. Traditionally, the problem is an instance of optimal design of experiments in statistics [35] where each vector corresponds to a linear measurement of an unknown vector and the goal is to pick k of them that minimize the average variance of the error in the maximum likelihood estimate of the vector being measured. The problem also finds applications in sensor placement in wireless networks [22], sparse least squares regression [8], feature selection for k-means clustering [9], and matrix approximation [13, 14, 5]. In this paper, we introduce proportional volume sampling to obtain improved approximation algorithms for A-optimal design. Given a matrix, proportional volume sampling involves picking a set of columns S of size k with probability proportional to µ(S) times det(∑i∊Svivi⊺) for some measure µ. Our main result is to show the approximability of the A-optimal design problem can be reduced to approximate independence properties of the measure µ. We appeal to hardcore distributions as candidate distributions µ that allow us to obtain improved approximation algorithms for the A-optimal design. Our results include a d-approximation when k = d, an (1 + ∊)-approximation when and -approximation when repetitions of vectors are allowed in the solution. We also consider generalization of the problem for k ≤ d and obtain a k-approximation. We also show that the proportional volume sampling algorithm gives approximation algorithms for other optimal design objectives (such as D-optimal design [36] and generalized ratio objective [27]) matching or improving previous best known results. Interestingly, we show that a similar guarantee cannot be obtained for the E-optimal design problem. We also show that the A-optimal design problem is NP-hard to approximate within a fixed constant when k = d.more » « less
An official website of the United States government

