Sparse principal component analysis (SPCA) is designed to enhance the interpretability of traditional principal component analysis by optimally selecting a subset of features that comprise the first principal component. Given the NP-hard nature of SPCA, most current approaches resort to approximate solutions, typically achieved through tractable semidefinite programs or heuristic methods. To solve SPCA to optimality, we propose two exact mixed-integer semidefinite programs (MISDPs) and an arbitrarily equivalent mixed-integer linear program. The MISDPs allow us to design an effective branch-and-cut algorithm with closed-form cuts that do not need to solve dual problems. For the proposed mixed-integer formulations, we further derive the theoretical optimality gaps of their continuous relaxations. Besides, we apply the greedy and local search algorithms to solving SPCA and derive their first-known approximation ratios. Our numerical experiments reveal that the exact methods we developed can efficiently find optimal solutions for data sets containing hundreds of features. Furthermore, our approximation algorithms demonstrate both scalability and near-optimal performance when benchmarked on larger data sets, specifically those with thousands of features. History: Accepted by Andrea Lodi, Area Editor for Design & Analysis of Algorithms—Discrete. Funding: This research was supported in part by the Division of Civil, Mechanical and Manufacturing Innovation [Grant 224614], the Division of Computing and Communication Foundations [Grant 2246417], and the Office of Naval Research [Grant N00014-24-1-2066]. Supplemental Material: The software that supports the findings of this study is available within the paper and its Supplemental Information ( https://pubsonline.informs.org/doi/suppl/10.1287/ijoc.2022.0372 ) as well as from the IJOC GitHub software repository ( https://github.com/INFORMSJoC/2022.0372 ). The complete IJOC Software and Data Repository is available at https://informsjoc.github.io/ .
more »
« less
D-Optimal Data Fusion: Exact and Approximation Algorithms
We study the D-optimal Data Fusion (DDF) problem, which aims to select new data points, given an existing Fisher information matrix, so as to maximize the logarithm of the determinant of the overall Fisher information matrix. We show that the DDF problem is NP-hard and has no constant-factor polynomial-time approximation algorithm unless P = NP. Therefore, to solve the DDF problem effectively, we propose two convex integer-programming formulations and investigate their corresponding complementary and Lagrangian-dual problems. Leveraging the concavity of the objective functions in the two proposed convex integer-programming formulations, we design an exact algorithm, aimed at solving the DDF problem to optimality. We further derive a family of submodular valid inequalities and optimality cuts, which can significantly enhance the algorithm performance. We also develop scalable randomized-sampling and local-search algorithms with provable performance guarantees. Finally, we test our algorithms using real-world data on the new phasor-measurement-units placement problem for modern power grids, considering the existing conventional sensors. Our numerical study demonstrates the efficiency of our exact algorithm and the scalability and high-quality outputs of our approximation algorithms. History: Accepted by Andrea Lodi, Area Editor for Design & Analysis of Algorithms—Discrete. Funding: Y. Li and W. Xie were supported in part by Division of Civil, Mechanical and Manufacturing Innovation [Grant 2046414] and Division of Computing and Communication Foundations [Grant 2246417]. J. Lee was supported in part by Air Force Office of Scientific Research [Grants FA9550-19-1-0175 and FA9550-22-1-0172]. M. Fampa was supported in part by Conselho Nacional de Desenvolvimento Científico e Tecnológico [Grants 305444/2019-0 and 434683/2018-3]. Supplemental Material: The e-companion is available at https://doi.org/10.1287/ijoc.2022.0235 .
more »
« less
- PAR ID:
- 10444682
- Date Published:
- Journal Name:
- INFORMS Journal on Computing
- ISSN:
- 1091-9856
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
This paper studies a classic maximum entropy sampling problem (MESP), which aims to select the most informative principal submatrix of a prespecified size from a covariance matrix. By investigating its Lagrangian dual and primal characterization, we derive a novel convex integer program for MESP and show that its continuous relaxation yields a near-optimal solution. The results motivate us to develop a sampling algorithm and derive its approximation bound for MESP, which improves the best known bound in literature. We then provide an efficient deterministic implementation of the sampling algorithm with the same approximation bound. Besides, we investigate the widely used local search algorithm and prove its first known approximation bound for MESP. The proof techniques further inspire for us an efficient implementation of the local search algorithm. Our numerical experiments demonstrate that these approximation algorithms can efficiently solve medium-size and large-scale instances to near optimality. Finally, we extend the analyses to the A-optimal MESP, for which the objective is to minimize the trace of the inverse of the selected principal submatrix. Funding: This work was supported by the National Science Foundation Division of Information and Intelligent Systems [Grant 2246417] and Division of Civil, Mechanical and Manufacturing Innovation [Grant 2246414]. Supplemental Material: The e-companion is available at https://doi.org/10.1287/opre.2023.2488 .more » « less
-
This paper studies the distributionally robust fair transit resource allocation model (DrFRAM) under the Wasserstein ambiguity set to optimize the public transit resource allocation during a pandemic. We show that the proposed DrFRAM is highly nonconvex and nonlinear, and it is NP-hard in general. Fortunately, we show that DrFRAM can be reformulated as a mixed integer linear programming (MILP) by leveraging the equivalent representation of distributionally robust optimization and monotonicity properties, binarizing integer variables, and linearizing nonconvex terms. To improve the proposed MILP formulation, we derive stronger ones and develop valid inequalities by exploiting the model structures. Additionally, we develop scenario decomposition methods using different MILP formulations to solve the scenario subproblems and introduce a simple yet effective no one left-based approximation algorithm with a provable approximation guarantee to solve the model to near optimality. Finally, we numerically demonstrate the effectiveness of the proposed approaches and apply them to real-world data provided by the Blacksburg Transit. History: This paper has been accepted for the Transportation Science Special Issue on Emerging Topics in Transportation Science and Logistics. Funding: This work was supported by the Division of Computing and Communication Foundations [Grant 2153607] and the Division of Civil, Mechanical and Manufacturing Innovation [Grant 2046426]. Supplemental Material: The online appendix is available at https://doi.org/10.1287/trsc.2022.1159 .more » « less
-
Nonconvex quadratically constrained programs (QCPs) are generally NP-hard and challenging problems. In this paper, we propose two novel mixed-integer linear programming (MILP) approximations for a nonconvex QCP. Our method begins by utilizing an eigenvalue-based decomposition to express the nonconvex quadratic function as the difference of two convex functions. We then introduce an additional variable to partition each nonconvex constraint into a second-order cone (SOC) constraint and the complement of an SOC constraint. We employ two polyhedral approximation approaches to approximate the SOC constraint. The complement of an SOC constraint is approximated using a combination of linear and complementarity constraints. As a result, we approximate the nonconvex QCP with two linear programs with complementarity constraints (LPCCs). More importantly, we prove that the optimal values of the LPCCs asymptotically converge to that of the original nonconvex QCP. By proving the boundedness of the LPCCs, we further reformulate the LPCCs as MILPs. We demonstrate the effectiveness of our approaches via numerical experiments by applying our proposed approximations to randomly generated instances and two application problems: the joint decision and estimation problem and the two-trust-region subproblem. The numerical results show significant advantages of our approaches in terms of solution quality and computational time compared with existing benchmark approaches. History: Accepted by Pascal Van Hentenryck, Area Editor for Computational Modeling: Methods and Analysis. Funding: K. Pan was supported in part by the Research Grants Council of Hong Kong [Grant 15503723]. J. Cheng and B. Yang were supported in part by the Office of Naval Research [Grant N00014-20-1-2154]. J. Cheng was supported in part by the National Science Foundation [Grant ECCS-2404412]. B. Yang was supported in part by the Air Force Office of Scientific Research [Grant FA9550-23-1-0508]. Supplemental Material: The software that supports the findings of this study is available within the paper and its Supplemental Information ( https://pubsonline.informs.org/doi/suppl/10.1287/ijoc.2024.0719 ) as well as from the IJOC GitHub software repository ( https://github.com/INFORMSJoC/2024.0719 ). The complete IJOC Software and Data Repository is available at https://informsjoc.github.io/ .more » « less
-
We study linear Fisher markets with satiation. In these markets, sellers have earning limits, and buyers have utility limits. Beyond applications in economics, they arise in the context of maximizing Nash social welfare when allocating indivisible items to agents. In contrast to markets with either earning or utility limits, markets with both limits have not been studied before. They turn out to have fundamentally different properties. In general, the existence of competitive equilibria is not guaranteed. We identify a natural property of markets (termed money clearing) that implies existence. We show that the set of equilibria is not always convex, answering a question posed in the literature. We design an FPTAS to compute an approximate equilibrium and prove that the problem of computing an exact equilibrium lies in the complexity class continuous local search ([Formula: see text]; i.e., the intersection of polynomial local search ([Formula: see text]) and polynomial parity arguments on directed graphs ([Formula: see text])). For a constant number of buyers or goods, we give a polynomial-time algorithm to compute an exact equilibrium. We show how (approximate) equilibria can be rounded and provide the first constant-factor approximation algorithm (with a factor of 2.404) for maximizing Nash social welfare when agents have capped linear (also known as budget-additive) valuations. Finally, we significantly improve the approximation hardness for additive valuations to [Formula: see text]. Funding: J. Garg was supported by the National Science Foundation [Grant CCF-1942321 (CAREER)]. M. Hoefer was supported by Deutsche Forschungsgemeinschaft [Grants Ho 3831/5-1, Ho 3831/6-1, and Ho 3831/7-1].more » « less
An official website of the United States government

