skip to main content


The NSF Public Access Repository (NSF-PAR) system and access will be unavailable from 11:00 PM ET on Thursday, May 23 until 2:00 AM ET on Friday, May 24 due to maintenance. We apologize for the inconvenience.

Search for: All records

Creators/Authors contains: "Cheng, J."

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Introduction

    Predictive models have been used to aid early diagnosis of PCOS, though existing models are based on small sample sizes and limited to fertility clinic populations. We built a predictive model using machine learning algorithms based on an outpatient population at risk for PCOS to predict risk and facilitate earlier diagnosis, particularly among those who meet diagnostic criteria but have not received a diagnosis.


    This is a retrospective cohort study from a SafetyNet hospital’s electronic health records (EHR) from 2003-2016. The study population included 30,601 women aged 18-45 years without concurrent endocrinopathy who had any visit to Boston Medical Center for primary care, obstetrics and gynecology, endocrinology, family medicine, or general internal medicine. Four prediction outcomes were assessed for PCOS. The first outcome was PCOS ICD-9 diagnosis with additional model outcomes of algorithm-defined PCOS. The latter was based on Rotterdam criteria and merging laboratory values, radiographic imaging, and ICD data from the EHR to define irregular menstruation, hyperandrogenism, and polycystic ovarian morphology on ultrasound.


    We developed predictive models using four machine learning methods: logistic regression, supported vector machine, gradient boosted trees, and random forests. Hormone values (follicle-stimulating hormone, luteinizing hormone, estradiol, and sex hormone binding globulin) were combined to create a multilayer perceptron score using a neural network classifier. Prediction of PCOS prior to clinical diagnosis in an out-of-sample test set of patients achieved an average AUC of 85%, 81%, 80%, and 82%, respectively in Models I, II, III and IV. Significant positive predictors of PCOS diagnosis across models included hormone levels and obesity; negative predictors included gravidity and positive bHCG.


    Machine learning algorithms were used to predict PCOS based on a large at-risk population. This approach may guide early detection of PCOS within EHR-interfaced populations to facilitate counseling and interventions that may reduce long-term health consequences. Our model illustrates the potential benefits of an artificial intelligence-enabled provider assistance tool that can be integrated into the EHR to reduce delays in diagnosis. However, model validation in other hospital-based populations is necessary.

    more » « less
    Free, publicly-accessible full text available January 30, 2025
  2. Networked systems that occur in various domains, such as electric networks, the brain, and opinion networks, are known to obey conservation laws. For instance, electric networks obey Kirchoff’s laws, and social networks obey opinion consensus. Conservation laws are often modeled as balance equations that relate appropriate injected flows and potentials at the nodes of the networks. A recent line of work considers the problem of estimating the unknown structure of such networked systems from observations of node potentials (and only the knowledge of the statistics of injected flows). Given the dynamic nature of the systems under consideration, an equally important task is estimating the change in the structure of the network from data – the so called differential network analysis problem. That is, given two sets of node potential observations, the goal is to estimate the structural differences between the underlying networks. We formulate this novel differential network analysis problem for systems obeying conservation laws and devise a convex estimator to learn the edge changes directly from node potentials. We derive conditions under which the estimate is unique in the high-dimensional regime and devise an efficient ADMM-based approach to perform the estimation. Finally, we demonstrate the performance of our approach on synthetic and benchmark power network data. 
    more » « less
  3. Laser Induced Deep Etching (LIDE®), developed by LPKF, is a maskless laser processing method capable of patterning glass microstructures similar to microfluidics created by PDMS soft lithography. Here, we demonstrate a self-digitized droplet microfluidics chip with high aspect-ratio features and fine resolution via the LIDE® technology. LIDE® provides the means to translate microfluidic designs into glass in a process suitable for low-cost and high-volume manufacturing. 
    more » « less
  4. Local differential privacy (LDP) can be adopted to anonymize richer user data attributes that will be input to sophisticated machine learning (ML) tasks. However, today’s LDP approaches are largely task-agnostic and often lead to severe performance loss – they simply inject noise to all data attributes according to a given privacy budget, regardless of what features are most relevant for the ultimate task. In this paper, we address how to significantly improve the ultimate task performance with multi-dimensional user data by considering a task-aware privacy preservation problem. The key idea is to use an encoder-decoder framework to learn (and anonymize) a task-relevant latent representation of user data. We obtain an analytical near-optimal solution for the linear setting with mean-squared error (MSE) task loss. We also provide an approximate solution through a gradient-based learning algorithm for general nonlinear cases. Extensive experiments demonstrate that our task-aware approach significantly improves ultimate task accuracy compared to standard benchmark LDP approaches with the same level of privacy guarantee. 
    more » « less