

Search results: All records where Creators/Authors contains "Li, Jiajin"


  1. We focus on robust estimation of the unobserved state of a discrete-time stochastic system with linear dynamics. A standard analysis of this estimation problem assumes a baseline innovation model; with Gaussian innovations we recover the Kalman filter. However, in many settings, there is insufficient or corrupted data to validate the baseline model. To cope with this problem, we minimize the worst-case mean-squared estimation error over adversarial models chosen within a Wasserstein neighborhood of the baseline. We also constrain the adversarial innovations to form a martingale difference sequence. The martingale constraint relaxes the i.i.d. assumptions that are often imposed on the baseline model. Moreover, we show that the martingale constraints guarantee that the adversarial dynamics remain adapted to the natural information generated over time. Therefore, adding the martingale constraint allows us to improve upon overly conservative policies that also protect against unrealistic omniscient adversaries. We establish a strong duality result, which we use to develop an efficient subgradient method to compute the distributionally robust estimation policy. If the baseline innovations are Gaussian, we show that the worst-case adversary remains Gaussian. Our numerical experiments indicate that the martingale constraint may also add a layer of robustness to the choice of the adversarial power.
    Free, publicly-accessible full text available August 20, 2024
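The baseline model referred to in the abstract is the classical linear-Gaussian setting, where the Kalman filter is the minimum-MSE estimator; the paper's contribution is to robustify it over a Wasserstein neighborhood. As context, here is a minimal NumPy sketch of one non-robust baseline predict/update step (the matrix names `A`, `C`, `Q`, `R` are the usual textbook conventions, not notation taken from the paper):

```python
import numpy as np

def kalman_step(x, P, y, A, C, Q, R):
    """One predict/update step of the baseline (non-robust) Kalman filter.

    x, P : prior state estimate and covariance
    y    : new observation
    A, C : state-transition and observation matrices
    Q, R : process- and measurement-noise covariances
    """
    # Predict: propagate the estimate and covariance through the dynamics
    x_pred = A @ x
    P_pred = A @ P @ A.T + Q
    # Update: the Kalman gain minimizes the mean-squared estimation
    # error under the Gaussian baseline innovation model
    S = C @ P_pred @ C.T + R
    K = P_pred @ C.T @ np.linalg.inv(S)
    x_new = x_pred + K @ (y - C @ x_pred)
    P_new = (np.eye(len(x)) - K @ C) @ P_pred
    return x_new, P_new
```

The robust policy in the paper replaces this fixed-model MSE objective with the worst case over the Wasserstein ball, computed via the subgradient method the authors develop.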
  2. Finding multiple solutions of non-convex optimization problems is a ubiquitous yet challenging task. Most past algorithms either apply single-solution optimization methods from multiple random initial guesses or search in the vicinity of found solutions using ad hoc heuristics. We present an end-to-end method to learn the proximal operator of a family of training problems so that multiple local minima can be quickly obtained from initial guesses by iterating the learned operator, emulating the proximal-point algorithm that has fast convergence. The learned proximal operator can be further generalized to recover multiple optima for unseen problems at test time, enabling applications such as object detection. The key ingredient in our formulation is a proximal regularization term, which elevates the convexity of our training loss: by applying recent theoretical results, we show that for weakly-convex objectives with Lipschitz gradients, training of the proximal operator converges globally with a practical degree of over-parameterization. We further present an exhaustive benchmark for multi-solution optimization to demonstrate the effectiveness of our method. 
    Free, publicly-accessible full text available May 1, 2024
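The proximal-point iteration that the learned operator emulates can be illustrated on a toy one-dimensional problem. In this sketch the proximal operator is computed by brute-force grid search rather than learned, and the function `f` and all parameters are illustrative choices, not the paper's training problems; the point is only that iterating the operator from different initial guesses recovers different local minima:

```python
import numpy as np

def prox(f, x, lam=0.5, grid=np.linspace(-3, 3, 6001)):
    # Proximal operator, approximated by brute force on a 1-D grid:
    # prox_{lam f}(x) = argmin_z  f(z) + (1/(2*lam)) * (z - x)**2
    vals = f(grid) + (grid - x) ** 2 / (2 * lam)
    return grid[np.argmin(vals)]

# A nonconvex objective with two local minima, at z = -1 and z = +1
f = lambda z: (z ** 2 - 1) ** 2

def proximal_point(x0, iters=20):
    # Proximal-point algorithm: iterate the proximal operator;
    # fixed points are critical points of f
    x = x0
    for _ in range(iters):
        x = prox(f, x)
    return x
```

Starting from different initial guesses lands on different minima, which is the multi-solution behavior the learned operator generalizes to unseen problems.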
  4. We introduce a robust optimization method for flip-free distortion energies used, for example, in parametrization, deformation, and volume correspondence. This method can minimize a variety of distortion energies, such as the symmetric Dirichlet energy and our new symmetric gradient energy. We identify and exploit the special structure of distortion energies to employ an operator splitting technique, leading us to propose a novel alternating direction method of multipliers (ADMM) algorithm to deal with the nonconvex, nonsmooth nature of distortion energies. The scheme results in an efficient method where the global step involves a single matrix multiplication and the local steps are closed-form per-triangle/per-tetrahedron expressions that are highly parallelizable. The resulting general-purpose optimization algorithm exhibits robustness to flipped triangles and tetrahedra in initial data as well as during the optimization. We establish the convergence of our proposed algorithm under certain conditions and demonstrate applications to parametrization, deformation, and volume correspondence. 
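The global/local alternation described above follows the standard ADMM template. The following is a minimal illustrative instance on an L1-regularized least-squares problem (not a distortion energy, and not the paper's algorithm); it shows the same structure the abstract highlights: a single linear global step plus closed-form, independently parallelizable local updates:

```python
import numpy as np

def admm_l1(b, lam=1.0, rho=1.0, iters=100):
    """ADMM for min_x 0.5*||x - b||^2 + lam*||x||_1, split as f(x) + g(z), x = z.

    The x-update is one linear (here diagonal) solve -- analogous to the
    paper's single-matrix-multiplication global step -- and the z-update is
    a closed-form soft-threshold, analogous to the per-element local steps.
    """
    x = np.zeros_like(b); z = np.zeros_like(b); u = np.zeros_like(b)
    for _ in range(iters):
        x = (b + rho * (z - u)) / (1.0 + rho)                             # global step
        z = np.sign(x + u) * np.maximum(np.abs(x + u) - lam / rho, 0.0)   # local step
        u = u + x - z                                                      # dual update
    return z
```

For this problem the limit is the elementwise soft-threshold of `b`, so convergence is easy to check.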
  5. Distributionally robust optimization (DRO) has been shown to offer a principled way to regularize learning models. In this paper, we find that Tikhonov regularization is distributionally robust in an optimal transport sense (i.e., if an adversary chooses distributions in a suitable optimal transport neighborhood of the empirical measure), provided that suitable martingale constraints are also imposed. Further, we introduce a relaxation of the martingale constraints that not only provides a unified viewpoint on a class of existing robust methods but also leads to new regularization tools. To realize these novel tools, we propose provably efficient computational algorithms. As a byproduct, the strong duality theorem proved in this paper can potentially be applied to other problems of independent interest.
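The Tikhonov regularization that the paper identifies as distributionally robust is ordinary ridge regression. As a reference point, here is its standard closed-form solution (this is the classical estimator being reinterpreted, not the paper's new regularization tools):

```python
import numpy as np

def ridge(X, y, alpha=1.0):
    # Tikhonov-regularized least squares:
    #   argmin_w ||X w - y||^2 + alpha * ||w||^2
    # Normal equations: (X^T X + alpha I) w = X^T y
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ y)
```

In the paper's reading, the penalty level `alpha` corresponds to the size of the adversary's optimal-transport neighborhood under the martingale constraints.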
  6. Atopic dermatitis (AD) is a common skin disease in childhood whose diagnosis requires expertise in dermatology. Recent studies have indicated that host gene–microbe interactions in the gut contribute to human diseases, including AD. We sought to develop an accurate and automated pipeline for AD diagnosis based on transcriptome and microbiota data. Using these data from 161 subjects, including AD patients and healthy controls, we trained a machine learning classifier to predict the risk of AD. We found that the classifier could accurately differentiate subjects with AD from healthy individuals based on the omics data, with an average F1-score of 0.84. With this classifier, we also identified a set of 35 genes and 50 microbiota features that are predictive of AD. Among the selected features, we discovered at least three genes and three microorganisms directly or indirectly associated with AD. Although further replication in other cohorts is needed, our findings suggest that these genes and microbiota features may provide novel biological insights and may be developed into useful biomarkers for AD prediction.
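The reported average F1-score of 0.84 is the harmonic mean of precision and recall on the positive (AD) class. A minimal sketch of how this metric is computed (this is just the standard definition, not the authors' classification pipeline):

```python
def f1_score(y_true, y_pred):
    # F1 = harmonic mean of precision and recall for the positive class (label 1)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0
```

Unlike raw accuracy, F1 is informative when the AD and control classes are imbalanced, which is common in clinical cohorts.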
  7. Linear mixed models (LMMs) can be applied in meta-analyses of responses from individuals across multiple contexts, increasing power to detect associations while accounting for confounding effects arising from within-individual variation. However, traditional approaches to fitting these models can be computationally intractable. Here, we describe an efficient and exact method for fitting a multiple-context linear mixed model. Whereas existing exact methods may be cubic in their time complexity with respect to the number of individuals, our approach for multiple-context LMMs (mcLMM) is linear. These improvements allow for large-scale analyses requiring orders of magnitude less computing time and memory than existing methods. As examples, we apply our approach to identify expression quantitative trait loci from large-scale gene expression data measured across multiple tissues, as well as to joint analyses of multiple phenotypes in genome-wide association studies at biobank scale.
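Cubic-to-linear speedups of the kind the abstract describes typically come from exploiting the special structure of the LMM covariance rather than calling a generic dense solver. As an illustrative example (an assumption for exposition, not mcLMM's actual derivation): under a random-intercept model, one individual observed in n contexts has covariance V = a·I + b·11ᵀ, and systems in V can be solved in O(n) via the Sherman–Morrison identity instead of a generic O(n³) factorization:

```python
import numpy as np

def solve_compound_symmetry(a, b, n, rhs):
    """Solve V x = rhs where V = a*I + b*ones*ones^T, in O(n) time.

    Sherman-Morrison gives V^{-1} = (1/a) I - (b / (a*(a + n*b))) * ones*ones^T,
    so applying V^{-1} only needs the sum of rhs, not a matrix factorization.
    """
    s = rhs.sum()
    return rhs / a - (b * s / (a * (a + n * b))) * np.ones(n)
```

Against a dense `np.linalg.solve`, this returns the same answer while touching each entry of `rhs` only a constant number of times, which is the kind of structural shortcut that turns a cubic fitting procedure into a linear one.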