skip to main content

Attention:

The NSF Public Access Repository (NSF-PAR) system and access will be unavailable from 11:00 PM ET on Friday, September 13 until 2:00 AM ET on Saturday, September 14 due to maintenance. We apologize for the inconvenience.


Search for: All records

Creators/Authors contains: "Mak, Simon"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Free, publicly-accessible full text available June 13, 2025
  2. Free, publicly-accessible full text available June 1, 2025
  3. Free, publicly-accessible full text available May 29, 2025
  4. Free, publicly-accessible full text available June 30, 2025
  5. Abstract

    Robust principal component analysis (RPCA) is a widely used method for recovering low‐rank structure from data matrices corrupted by significant and sparse outliers. These corruptions may arise from occlusions, malicious tampering, or other causes for anomalies, and the joint identification of such corruptions with low‐rank background is critical for process monitoring and diagnosis. However, existing RPCA methods and their extensions largely do not account for the underlying probabilistic distribution for the data matrices, which in many applications are known and can be highly non‐Gaussian. We thus propose a new method called RPCA for exponential family distributions (), which can perform the desired decomposition into low‐rank and sparse matrices when such a distribution falls within the exponential family. We present a novel alternating direction method of multiplier optimization algorithm for efficient decomposition, under either its natural or canonical parametrization. The effectiveness of is then demonstrated in two applications: the first for steel sheet defect detection and the second for crime activity monitoring in the Atlanta metropolitan area.

     
    more » « less
    Free, publicly-accessible full text available April 1, 2025
  6. Free, publicly-accessible full text available April 2, 2025
  7. Free, publicly-accessible full text available March 31, 2025
  8. Abstract

    Bias in causal comparisons has a correspondence with distributional imbalance of covariates between treatment groups. Weighting strategies such as inverse propensity score weighting attempt to mitigate bias by either modeling the treatment assignment mechanism or balancing specified covariate moments. This article introduces a new weighting method, called energy balancing, which instead aims to balance weighted covariate distributions. By directly targeting distributional imbalance, the proposed weighting strategy can be flexibly utilized in a wide variety of causal analyses without the need for careful model or moment specification. Our energy balancing weights (EBW) approach has several advantages over existing weighting techniques. First, it offers a model-free and robust approach for obtaining covariate balance that does not require tuning parameters, obviating the need for modeling decisions of secondary nature to the scientific question at hand. Second, since this approach is based on a genuine measure of distributional balance, it provides a means for assessing the balance induced by a given set of weights for a given dataset. We demonstrate the effectiveness of this EBW approach in a suite of simulation experiments, and in studies on the safety of right heart catheterization and on three additional studies using electronic health record data.

     
    more » « less
    Free, publicly-accessible full text available January 1, 2025
  9. Free, publicly-accessible full text available October 22, 2024