Search for: All records

Creators/Authors contains: "Fox, Emily"


  1. Hybrid models composing mechanistic ODE-based dynamics with flexible and expressive neural network components have grown rapidly in popularity, especially in scientific domains where such ODE-based modeling offers important interpretability and validated causal grounding (e.g., for counterfactual reasoning). The incorporation of mechanistic models also provides inductive bias in standard black-box modeling approaches, which is critical when learning from small datasets or partially observed, complex systems. Unfortunately, as the hybrid models become more flexible, the causal grounding provided by the mechanistic model can quickly be lost. We address this problem by leveraging another common source of domain knowledge: the ranking of treatment effects for a set of interventions, even if the precise treatment effect is unknown. We encode this information in a causal loss that we combine with the standard predictive loss to arrive at a hybrid loss that biases our learning towards causally valid hybrid models. We demonstrate our ability to achieve a win-win, combining state-of-the-art predictive performance with causal validity, in the challenging task of modeling glucose dynamics post-exercise in individuals with type 1 diabetes. 
    Free, publicly-accessible full text available July 21, 2025
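As a rough illustration of the hybrid loss described in item 1, the sketch below adds a pairwise ranking penalty to a predictive loss. The abstract does not give the form of either term, so the mean squared error, the hinge-style penalty, and the names `predictive_loss`, `causal_ranking_loss`, `hybrid_loss`, `weight`, and `margin` are all assumptions made for illustration, not the authors' formulation.

```python
def predictive_loss(y_pred, y_true):
    """Mean squared error between predicted and observed values (assumed form)."""
    return sum((p - t) ** 2 for p, t in zip(y_pred, y_true)) / len(y_true)


def causal_ranking_loss(effects, ranking, margin=0.0):
    """Hinge penalty for every pair of interventions ordered against the known ranking.

    effects: dict mapping intervention name -> model-estimated treatment effect
    ranking: intervention names, from largest expected effect to smallest
    (Hypothetical interface; only the ranking is assumed known, not the effect sizes.)
    """
    loss = 0.0
    for i in range(len(ranking)):
        for j in range(i + 1, len(ranking)):
            higher = effects[ranking[i]]
            lower = effects[ranking[j]]
            # Zero when the model respects the ordering by at least `margin`.
            loss += max(0.0, margin - (higher - lower))
    return loss


def hybrid_loss(y_pred, y_true, effects, ranking, weight=1.0, margin=0.0):
    """Predictive loss plus a weighted causal ranking penalty."""
    return predictive_loss(y_pred, y_true) + weight * causal_ranking_loss(
        effects, ranking, margin
    )
```

With `weight=0` this reduces to the purely predictive objective; larger weights push the model harder towards respecting the known ordering of interventions.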
  2. In this paper we define and investigate the Fréchet edit distance problem. Here, given two polygonal curves $\pi$ and $\sigma$ and a threshold value $\delta$, we seek the minimum number of edits to $\sigma$ such that the Fréchet distance between the edited curve and $\pi$ is at most $\delta$. For the edit operations we consider three cases, namely, deletion of vertices, insertion of vertices, or both. For this basic problem we consider a number of variants. Specifically, we provide polynomial time algorithms for both discrete and continuous Fréchet edit distance variants, as well as hardness results for weak Fréchet edit distance variants. 
  3. Mulzer, Wolfgang ; Phillips, Jeff M (Ed.)
    We define and investigate the Fréchet edit distance problem. Given two polygonal curves π and σ and a non-negative threshold value δ, we seek the minimum number of edits to σ such that the Fréchet distance between the edited σ and π is at most δ. For the edit operations we consider three cases, namely, deletion of vertices, insertion of vertices, or both. For this basic problem we consider a number of variants. Specifically, we provide polynomial time algorithms for both discrete and continuous Fréchet edit distance variants, as well as hardness results for weak Fréchet edit distance variants. 
    Free, publicly-accessible full text available June 6, 2025
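To make the problem statement in items 2 and 3 concrete, the sketch below computes the discrete Fréchet distance with the standard dynamic program and then finds the minimum number of vertex deletions from σ by exhaustive search. This is only a definition-level illustration for tiny inputs: the exhaustive search is exponential and is not the polynomial time algorithm from the paper. The function names are hypothetical, and allowing any vertex to be deleted as long as one remains is an assumption.

```python
from itertools import combinations
from math import dist


def discrete_frechet(p, q):
    """Discrete Fréchet distance between vertex sequences p and q (standard DP)."""
    n, m = len(p), len(q)
    dp = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = dist(p[i], q[j])
            if i == 0 and j == 0:
                dp[i][j] = d
            elif i == 0:
                dp[i][j] = max(dp[0][j - 1], d)
            elif j == 0:
                dp[i][j] = max(dp[i - 1][0], d)
            else:
                best_prev = min(dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1])
                dp[i][j] = max(best_prev, d)
    return dp[-1][-1]


def min_deletions_brute_force(pi, sigma, delta):
    """Fewest vertices to delete from sigma so the distance to pi is <= delta.

    Exhaustive search over subsets (exponential time); returns None if no
    non-empty subsequence of sigma is within delta of pi.
    """
    n = len(sigma)
    for k in range(n):  # k deleted vertices, keeping at least one
        for kept in combinations(range(n), n - k):
            edited = [sigma[i] for i in kept]
            if discrete_frechet(pi, edited) <= delta:
                return k
    return None
```

For example, with `pi = [(0, 0), (1, 0), (2, 0)]` and `sigma = [(0, 0), (1, 5), (2, 0)]`, deleting the outlying middle vertex of `sigma` is enough to bring the distance within `delta = 1`.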
  4. In this paper we introduce and formally study the problem of $k$-clustering with faulty centers. Specifically, we study the faulty versions of $k$-center, $k$-median, and $k$-means clustering, where centers have some probability of not existing, as opposed to prior work where clients had some probability of not existing. For all three problems we provide fixed parameter tractable algorithms, in the parameters $k$, $d$, and $\varepsilon$, that $(1+\varepsilon)$-approximate the minimum expected cost solutions for points in $d$-dimensional Euclidean space. For Faulty $k$-center we additionally provide a 5-approximation for general metrics. Significantly, all of our algorithms have only a linear dependence on $n$. 
    Free, publicly-accessible full text available February 1, 2025
  5. Given a set of points $P = (P^+ \sqcup P^-) \subset \mathbb{R}^d$ for some constant $d$ and a supply function $\mu:P\to \mathbb{R}$ such that $\mu(p) > 0~\forall p \in P^+$, $\mu(p) < 0~\forall p \in P^-$, and $\sum_{p\in P}{\mu(p)} = 0$, the geometric transportation problem asks one to find a transportation map $\tau: P^+\times P^-\to \mathbb{R}_{\ge 0}$ such that $\sum_{q\in P^-}{\tau(p, q)} = \mu(p)~\forall p \in P^+$, $\sum_{p\in P^+}{\tau(p, q)} = -\mu(q)~\forall q \in P^-$, and the weighted sum of Euclidean distances for the pairs $\sum_{(p,q)\in P^+\times P^-}\tau(p, q)\cdot ||q-p||_2$ is minimized. We present the first deterministic algorithm that computes, in near-linear time, a transportation map whose cost is within a $(1 + \varepsilon)$ factor of optimal. More precisely, our algorithm runs in $O(n\varepsilon^{-(d+2)}\log^5{n}\log{\log{n}})$ time for any constant $\varepsilon > 0$. While a randomized $n\varepsilon^{-O(d)}\log^{O(d)}{n}$ time algorithm for this problem was discovered in the last few years, all previously known deterministic $(1 + \varepsilon)$-approximation algorithms run in $\Omega(n^{3/2})$ time. A similar situation existed for geometric bipartite matching, the special case of geometric transportation where all supplies are unit, until a deterministic $n\varepsilon^{-O(d)}\log^{O(d)}{n}$ time $(1 + \varepsilon)$-approximation algorithm was presented at STOC 2022. Surprisingly, our result is not only a generalization of the bipartite matching one to arbitrary instances of geometric transportation, but it also reduces the running time for all previously known $(1 + \varepsilon)$-approximation algorithms, randomized or deterministic, even for geometric bipartite matching. In particular, we give the first $(1 + \varepsilon)$-approximate deterministic algorithm for geometric bipartite matching and the first $(1 + \varepsilon)$-approximate deterministic or randomized algorithm for geometric transportation with no dependence on $d$ in the exponent of the running time's polylog. As an additional application of our main ideas, we also give the first randomized near-linear $O(\varepsilon^{-2} m \log^{O(1)} n)$ time $(1 + \varepsilon)$-approximation algorithm for the uncapacitated minimum cost flow (transshipment) problem in undirected graphs with arbitrary real edge costs. 
  6. Free, publicly-accessible full text available January 23, 2025
  7. Efficiently capturing the long-range patterns in sequential data sources salient to a given task -- such as classification and generative modeling -- poses a fundamental challenge. Popular approaches in the space trade off between the memory burden of brute-force enumeration and comparison, as in transformers, the computational burden of complicated sequential dependencies, as in recurrent neural networks, or the parameter burden of convolutional networks with many or large filters. We instead take inspiration from wavelet-based multiresolution analysis to define a new building block for sequence modeling, which we call a MultiresLayer. The key component of our model is the multiresolution convolution, capturing multiscale trends in the input sequence. Our MultiresConv can be implemented with shared filters across a dilated causal convolution tree. Thus it garners the computational advantages of convolutional networks and the principled theoretical motivation of wavelet decompositions. Our MultiresLayer is straightforward to implement, requires significantly fewer parameters, and maintains at most an $O(N \log N)$ memory footprint for a length-$N$ sequence. Yet, by stacking such layers, our model yields state-of-the-art performance on a number of sequence classification and autoregressive density estimation tasks using CIFAR-10, ListOps, and PTB-XL datasets. 
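The idea of sharing filters across a dilated causal convolution tree, as described in item 7, can be sketched in plain Python as follows. This is a minimal reading of the abstract rather than the published MultiresConv: the two shared filters `h0` (coarse) and `h1` (detail), the doubling dilation per level, and the per-level mixing `weights` are assumptions made for illustration, and all function names are hypothetical.

```python
def dilated_causal_conv(x, filt, dilation):
    """y[t] = sum_k filt[k] * x[t - k * dilation], with x[t] = 0 for t < 0."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(filt):
            idx = t - k * dilation
            if idx >= 0:  # causal: only current and past inputs contribute
                acc += w * x[idx]
        out.append(acc)
    return out


def multires_conv(x, h0, h1, weights):
    """Mix multiscale details and the final coarse signal (assumed structure).

    The same filters h0 and h1 are reused at every level; only the dilation
    changes. `weights` holds one weight per detail level plus one for the
    final coarse approximation, so depth = len(weights) - 1.
    """
    depth = len(weights) - 1
    approx = list(x)
    y = [0.0] * len(x)
    for level in range(depth):
        dilation = 2 ** level
        detail = dilated_causal_conv(approx, h1, dilation)
        approx = dilated_causal_conv(approx, h0, dilation)
        y = [yi + weights[level] * di for yi, di in zip(y, detail)]
    return [yi + weights[depth] * ai for yi, ai in zip(y, approx)]
```

Because the filters are shared, the parameter count in this sketch is `len(h0) + len(h1) + len(weights)` regardless of sequence length, while the receptive field doubles with each added level.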
  8. null (Ed.)
  9. Importance: Continuous glucose monitoring (CGM) is associated with improvements in hemoglobin A1c (HbA1c) in youths with type 1 diabetes (T1D); however, youths from minoritized racial and ethnic groups and those with public insurance face greater barriers to CGM access. Early initiation of and access to CGM may reduce disparities in CGM uptake and improve diabetes outcomes.
    Objective: To determine whether HbA1c decreases differed by ethnicity and insurance status among a cohort of youths newly diagnosed with T1D and provided CGM.
    Design, Setting, and Participants: This cohort study used data from the Teamwork, Targets, Technology, and Tight Control (4T) study, a clinical research program that aims to initiate CGM within 1 month of T1D diagnosis. All youths with new-onset T1D diagnosed between July 25, 2018, and June 15, 2020, at Stanford Children’s Hospital, a single-site, freestanding children’s hospital in California, were approached to enroll in the Pilot-4T study and were followed for 12 months. Data analysis was performed and completed on June 3, 2022.
    Exposures: All eligible participants were offered CGM within 1 month of diabetes diagnosis.
    Main Outcomes and Measures: To assess HbA1c change over the study period, analyses were stratified by ethnicity (Hispanic vs non-Hispanic) or insurance status (public vs private) to compare the Pilot-4T cohort with a historical cohort of 272 youths diagnosed with T1D between June 1, 2014, and December 28, 2016.
    Results: The Pilot-4T cohort comprised 135 youths, with a median age of 9.7 years (IQR, 6.8-12.7 years) at diagnosis. There were 71 boys (52.6%) and 64 girls (47.4%). Based on self-report, participants’ race was categorized as Asian or Pacific Islander (19 [14.1%]), White (62 [45.9%]), or other race (39 [28.9%]); race was missing or not reported for 15 participants (11.1%). Participants also self-reported their ethnicity as Hispanic (29 [21.5%]) or non-Hispanic (92 [68.1%]). A total of 104 participants (77.0%) had private insurance and 31 (23.0%) had public insurance. Compared with the historical cohort, similar reductions in HbA1c at 6, 9, and 12 months postdiagnosis were observed for Hispanic individuals (estimated difference, −0.26% [95% CI, −1.05% to 0.43%], −0.60% [−1.46% to 0.21%], and −0.15% [−1.48% to 0.80%]) and non-Hispanic individuals (estimated difference, −0.27% [95% CI, −0.62% to 0.10%], −0.50% [−0.81% to −0.11%], and −0.47% [−0.91% to 0.06%]) in the Pilot-4T cohort. Similar reductions in HbA1c at 6, 9, and 12 months postdiagnosis were also observed for publicly insured individuals (estimated difference, −0.52% [95% CI, −1.22% to 0.15%], −0.38% [−1.26% to 0.33%], and −0.57% [−2.08% to 0.74%]) and privately insured individuals (estimated difference, −0.34% [95% CI, −0.67% to 0.03%], −0.57% [−0.85% to −0.26%], and −0.43% [−0.85% to 0.01%]) in the Pilot-4T cohort. Hispanic youths in the Pilot-4T cohort had higher HbA1c at 6, 9, and 12 months postdiagnosis than non-Hispanic youths (estimated difference, 0.28% [95% CI, −0.46% to 0.86%], 0.63% [0.02% to 1.20%], and 1.39% [0.37% to 1.96%]), as did publicly insured youths compared with privately insured youths (estimated difference, 0.39% [95% CI, −0.23% to 0.99%], 0.95% [0.28% to 1.45%], and 1.16% [−0.09% to 2.13%]).
    Conclusions and Relevance: The findings of this cohort study suggest that CGM initiation soon after diagnosis is associated with similar improvements in HbA1c for Hispanic and non-Hispanic youths as well as for publicly and privately insured youths. These results further suggest that equitable access to CGM soon after T1D diagnosis may be a first step to improve HbA1c for all youths but is unlikely to eliminate disparities entirely.
    Trial Registration: ClinicalTrials.gov Identifier: NCT04336969 