NSF PAR Search | NSF Public Access Repository

HDP-Flow: Generalizable Bayesian Nonparametric Model for Time Series State Discovery

Tonekaboni, Sana; Behrouzi, Tina; Weatherhead, Addison; Fox, Emily; Blei, David; Goldenberg, Anna (July 2025, PMLR (Conference on Uncertainty in Artificial Intelligence))

Free, publicly-accessible full text available July 21, 2026
Learning Explainable Treatment Policies with Clinician-Informed Representations: A Practical Approach

Ferstad, Johannes O; Fox, Emily B; Scheinker, David; Johari, Ramesh (December 2024, Proceedings of Machine Learning Research (Machine Learning for Health))

Full Text Available
Fréchet Edit Distance

Fox, Emily; Nayyeri, Amir; Perry, Jonathan James; Raichel, Benjamin (June 2024, Proceedings of the 40th International Symposium on Computational Geometry)

In this paper we define and investigate the Fréchet edit distance problem. Here, given two polygonal curves $$\pi$$ and $$\sigma$$ and a threshold value $$\delta$$, we seek the minimum number of edits to $$\sigma$$ such that the Fréchet distance between the edited curve and $$\pi$$ is at most $$\delta$$. For the edit operations we consider three cases, namely, deletion of vertices, insertion of vertices, or both. For this basic problem we consider a number of variants. Specifically, we provide polynomial time algorithms for both discrete and continuous Fréchet edit distance variants, as well as hardness results for weak Fréchet edit distance variants.
Fréchet Edit Distance

https://doi.org/10.4230/LIPIcs.SoCG.2024.58

Fox, Emily; Nayyeri, Amir; Perry, Jonathan James; Raichel, Benjamin (June 2024, Schloss Dagstuhl – Leibniz-Zentrum für Informatik)
Mulzer, Wolfgang; Phillips, Jeff M (Ed.)
We define and investigate the Fréchet edit distance problem. Given two polygonal curves π and σ and a non-negative threshold value δ, we seek the minimum number of edits to σ such that the Fréchet distance between the edited σ and π is at most δ. For the edit operations we consider three cases, namely, deletion of vertices, insertion of vertices, or both. For this basic problem we consider a number of variants. Specifically, we provide polynomial time algorithms for both discrete and continuous Fréchet edit distance variants, as well as hardness results for weak Fréchet edit distance variants.
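
To make the problem statement concrete, here is a minimal Python sketch of the deletion-only, discrete variant. It is not the polynomial-time algorithm from the paper: it uses the standard dynamic program for the discrete Fréchet distance and then brute-forces over deletion sets, so it is exponential in the length of σ and only suitable for tiny inputs. The function names, and the choice to allow any vertex (including endpoints) to be deleted, are assumptions made for illustration.

```python
from itertools import combinations
from math import dist


def discrete_frechet(p, q):
    """Discrete Fréchet distance between point sequences p and q (standard DP)."""
    n, m = len(p), len(q)
    table = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = dist(p[i], q[j])
            if i == 0 and j == 0:
                table[i][j] = d
            elif i == 0:
                table[i][j] = max(table[0][j - 1], d)
            elif j == 0:
                table[i][j] = max(table[i - 1][0], d)
            else:
                best_prev = min(table[i - 1][j], table[i][j - 1], table[i - 1][j - 1])
                table[i][j] = max(best_prev, d)
    return table[n - 1][m - 1]


def min_deletions_brute_force(pi, sigma, delta):
    """Fewest vertex deletions from sigma so that the discrete Fréchet
    distance to pi is at most delta; None if no deletion set works.
    Hypothetical helper: exhaustive search, exponential time."""
    n = len(sigma)
    for k in range(n):  # always keep at least one vertex of sigma
        for removed in combinations(range(n), k):
            drop = set(removed)
            edited = [v for i, v in enumerate(sigma) if i not in drop]
            if discrete_frechet(pi, edited) <= delta:
                return k
    return None


pi = [(0, 0), (1, 0), (2, 0)]
sigma = [(0, 0), (1, 5), (2, 0)]
print(min_deletions_brute_force(pi, sigma, delta=1.0))
```

In this toy instance the outlier vertex (1, 5) keeps σ far from π, and removing it alone brings the distance down to the threshold.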
Full Text Available
Hybrid^2 Neural ODE Causal Modeling and an Application to Glycemic Response

Zou, Bob J; Levine, Matthew E; Zaharieva, Dessi P; Johari, Ramesh; Fox, Emily B (July 2024, 41st International Conference on Machine Learning (ICML))

Hybrid models composing mechanistic ODE-based dynamics with flexible and expressive neural network components have grown rapidly in popularity, especially in scientific domains where such ODE-based modeling offers important interpretability and validated causal grounding (e.g., for counterfactual reasoning). The incorporation of mechanistic models also provides inductive bias in standard blackbox modeling approaches, critical when learning from small datasets or partially observed, complex systems. Unfortunately, as the hybrid models become more flexible, the causal grounding provided by the mechanistic model can quickly be lost. We address this problem by leveraging another common source of domain knowledge: ranking of treatment effects for a set of interventions, even if the precise treatment effect is unknown. We encode this information in a causal loss that we combine with the standard predictive loss to arrive at a hybrid loss that biases our learning towards causally valid hybrid models. We demonstrate our ability to achieve a win-win, with both state-of-the-art predictive performance and causal validity, in the challenging task of modeling glucose dynamics post-exercise in individuals with type 1 diabetes.
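
The idea of combining a predictive loss with a ranking-based causal loss can be sketched as follows. This is an illustration only, not the loss used in the paper: the pairwise hinge penalty, the mean squared error term, the `weight` parameter, and all function names are assumptions chosen to show the general shape of such a hybrid objective.

```python
def predictive_loss(predicted, observed):
    """Mean squared error between predicted and observed values."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)


def ranking_violation_loss(effects, ranking):
    """Penalize predicted treatment effects that contradict a known ranking.

    effects: dict mapping intervention name -> predicted effect
    ranking: interventions ordered from largest to smallest expected effect
    (Assumed form: a pairwise hinge penalty on each misordered pair.)
    """
    loss = 0.0
    for i, higher in enumerate(ranking):
        for lower in ranking[i + 1:]:
            loss += max(0.0, effects[lower] - effects[higher])
    return loss


def hybrid_loss(predicted, observed, effects, ranking, weight=1.0):
    """Predictive loss plus a weighted causal (ranking) penalty."""
    return predictive_loss(predicted, observed) + weight * ranking_violation_loss(effects, ranking)


effects = {"a": 0.5, "b": 1.5, "c": 0.0}  # model says b > a, but domain knowledge says a > b > c
print(hybrid_loss([1.0, 2.0], [1.0, 4.0], effects, ["a", "b", "c"], weight=0.5))
```

Only the misordered pair (a, b) contributes to the causal term here, so a model can fit the data well and still be penalized for violating the known ordering of interventions.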
Full Text Available
A Simple Deterministic Near-Linear Time Approximation Scheme for Transshipment with Arbitrary Positive Edge Costs

https://doi.org/10.4230/LIPIcs.ESA.2024.56

Fox, Emily (January 2024, Schloss Dagstuhl – Leibniz-Zentrum für Informatik)
Chan, Timothy; Fischer, Johannes; Iacono, John; Herman, Grzegorz (Ed.)
We describe a simple deterministic near-linear time approximation scheme for uncapacitated minimum cost flow in undirected graphs with positive real edge weights, a problem also known as transshipment. Specifically, our algorithm takes as input a (connected) undirected graph G = (V, E), vertex demands b ∈ R^V such that ∑_{v ∈ V} b(v) = 0, positive edge costs c ∈ R_{> 0}^E, and a parameter ε > 0. In O(ε^{-2} m log^{O(1)} n) time, it returns a flow f such that the net flow out of each vertex is equal to the vertex’s demand and the cost of the flow is within a (1 ± ε) factor of optimal. Our algorithm is combinatorial and has no running time dependency on the demands or edge costs. With the exception of a recent result presented at STOC 2022 for polynomially bounded edge weights, all almost- and near-linear time approximation schemes for transshipment relied on randomization to embed the problem instance into low-dimensional space. Our algorithm instead deterministically approximates the cost of routing decisions that would be made if the input were subject to a random tree embedding. To avoid computing the Ω(n²) vertex-vertex distances that an approximation of this kind suggests, we also take advantage of the clustering method used in the well-known Thorup-Zwick distance oracle.
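
The input and output described above can be illustrated with a small feasibility check. This sketch does not implement the approximation scheme; it only verifies that a candidate flow meets the demand constraint stated in the abstract (net flow out of each vertex equals its demand) and reports its cost. The data layout (directed pairs for flow, `frozenset` keys for undirected edge costs) and the function name are assumptions.

```python
from math import isclose


def flow_cost(edge_cost, demand, flow, tol=1e-9):
    """Check a transshipment flow and return its total cost.

    edge_cost: dict frozenset({u, v}) -> positive cost of the undirected edge
    demand:    dict vertex -> b(v), summing to zero
    flow:      dict (u, v) -> non-negative amount sent from u to v
    """
    net_out = {v: 0.0 for v in demand}
    total = 0.0
    for (u, v), amount in flow.items():
        if amount < 0:
            raise ValueError("flow amounts must be non-negative")
        edge = frozenset((u, v))
        if edge not in edge_cost:
            raise ValueError(f"no edge between {u!r} and {v!r}")
        net_out[u] += amount
        net_out[v] -= amount
        total += amount * edge_cost[edge]
    for v, b in demand.items():
        if not isclose(net_out[v], b, abs_tol=tol):
            raise ValueError(f"net flow out of {v!r} does not match its demand")
    return total


costs = {frozenset(("a", "b")): 1.0, frozenset(("b", "c")): 2.0}
demands = {"a": 1.0, "b": 0.0, "c": -1.0}
print(flow_cost(costs, demands, {("a", "b"): 1.0, ("b", "c"): 1.0}))
```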
Full Text Available
Clustering with faulty centers

https://doi.org/10.1016/j.comgeo.2023.102052

Fox, Emily; Huang, Hongyao; Raichel, Benjamin (February 2024, Computational Geometry)

In this paper we introduce and formally study the problem of $$k$$-clustering with faulty centers. Specifically, we study the faulty versions of $$k$$-center, $$k$$-median, and $$k$$-means clustering, where centers have some probability of not existing, as opposed to prior work where clients had some probability of not existing. For all three problems we provide fixed-parameter tractable algorithms, in the parameters $$k$$, $$d$$, and $$\varepsilon$$, that $$(1+\varepsilon)$$-approximate the minimum expected cost solutions for points in $$d$$-dimensional Euclidean space. For Faulty $$k$$-center we additionally provide a 5-approximation for general metrics. Significantly, all of our algorithms have only a linear dependence on $$n$$.
Full Text Available
A deterministic near-linear time approximation scheme for geometric transportation

https://doi.org/10.1109/FOCS57990.2023.00078

Fox, Emily; Lu, Jiashuai (November 2023, IEEE)

Given a set of points $$P = (P^+ \sqcup P^-) \subset \mathbb{R}^d$$ for some constant $$d$$ and a supply function $$\mu:P\to \mathbb{R}$$ such that $$\mu(p) > 0~\forall p \in P^+$$, $$\mu(p) < 0~\forall p \in P^-$$, and $$\sum_{p\in P}{\mu(p)} = 0$$, the geometric transportation problem asks one to find a transportation map $$\tau: P^+\times P^-\to \mathbb{R}_{\ge 0}$$ such that $$\sum_{q\in P^-}{\tau(p, q)} = \mu(p)~\forall p \in P^+$$, $$\sum_{p\in P^+}{\tau(p, q)} = -\mu(q)~\forall q \in P^-$$, and the weighted sum of Euclidean distances for the pairs $$\sum_{(p,q)\in P^+\times P^-}\tau(p, q)\cdot ||q-p||_2$$ is minimized. We present the first deterministic algorithm that computes, in near-linear time, a transportation map whose cost is within a $$(1 + \varepsilon)$$ factor of optimal. More precisely, our algorithm runs in $$O(n\varepsilon^{-(d+2)}\log^5{n}\log{\log{n}})$$ time for any constant $$\varepsilon > 0$$. While a randomized $$n\varepsilon^{-O(d)}\log^{O(d)}{n}$$ time algorithm for this problem was discovered in the last few years, all previously known deterministic $$(1 + \varepsilon)$$-approximation algorithms run in $$\Omega(n^{3/2})$$ time. A similar situation existed for geometric bipartite matching, the special case of geometric transportation where all supplies are unit, until a deterministic $$n\varepsilon^{-O(d)}\log^{O(d)}{n}$$ time $$(1 + \varepsilon)$$-approximation algorithm was presented at STOC 2022. Surprisingly, our result is not only a generalization of the bipartite matching one to arbitrary instances of geometric transportation, but it also reduces the running time for all previously known $$(1 + \varepsilon)$$-approximation algorithms, randomized or deterministic, even for geometric bipartite matching.
In particular, we give the first $$(1 + \varepsilon)$$-approximate deterministic algorithm for geometric bipartite matching and the first $$(1 + \varepsilon)$$-approximate deterministic or randomized algorithm for geometric transportation with no dependence on $$d$$ in the exponent of the running time's polylog. As an additional application of our main ideas, we also give the first randomized near-linear $$O(\varepsilon^{-2} m \log^{O(1)} n)$$ time $$(1 + \varepsilon)$$-approximation algorithm for the uncapacitated minimum cost flow (transshipment) problem in undirected graphs with arbitrary real edge costs.
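
The constraints in the problem definition above can be checked mechanically. The following Python sketch validates a candidate transportation map τ against a supply function μ and evaluates the objective; it does not compute an optimal or approximate map. The dictionary-based representation and the function name are assumptions for illustration.

```python
from math import dist, isclose


def transport_cost(tau, mu, tol=1e-9):
    """Validate a transportation map and return its cost.

    mu:  dict point -> supply (positive on P+, negative on P-), summing to zero
    tau: dict (p, q) -> non-negative amount moved from p in P+ to q in P-
    """
    sources = [p for p, s in mu.items() if s > 0]
    sinks = [q for q, s in mu.items() if s < 0]
    if not isclose(sum(mu.values()), 0.0, abs_tol=tol):
        raise ValueError("supplies must sum to zero")
    for (p, q), amount in tau.items():
        if amount < 0 or mu.get(p, 0) <= 0 or mu.get(q, 0) >= 0:
            raise ValueError("tau must map P+ x P- to non-negative amounts")
    for p in sources:  # everything supplied at p is shipped out
        shipped = sum(tau.get((p, q), 0.0) for q in sinks)
        if not isclose(shipped, mu[p], abs_tol=tol):
            raise ValueError(f"supply at {p} is not fully shipped")
    for q in sinks:  # everything demanded at q is received
        received = sum(tau.get((p, q), 0.0) for p in sources)
        if not isclose(received, -mu[q], abs_tol=tol):
            raise ValueError(f"demand at {q} is not met")
    # Objective: sum of tau(p, q) * ||q - p||_2 over all pairs
    return sum(amount * dist(p, q) for (p, q), amount in tau.items())


mu = {(0, 0): 2.0, (3, 0): -1.0, (0, 4): -1.0}
tau = {((0, 0), (3, 0)): 1.0, ((0, 0), (0, 4)): 1.0}
print(transport_cost(tau, mu))
```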
Full Text Available
Smart Start — Designing Powerful Clinical Trials Using Pilot Study Data

https://doi.org/10.1056/EVIDoa2300164

Ferstad, Johannes O; Prahalad, Priya; Maahs, David M; Zaharieva, Dessi P; Fox, Emily; Desai, Manisha; Johari, Ramesh; Scheinker, David (January 2024, NEJM Evidence)

Full Text Available
Sequence Modeling with Multiresolution Convolutional Memory

Shi, Jiaxin; Wang, Ke Alex; Fox, Emily B. (January 2023, International Conference on Machine Learning (ICML))

Efficiently capturing the long-range patterns in sequential data sources salient to a given task -- such as classification and generative modeling -- poses a fundamental challenge. Popular approaches in the space trade off between the memory burden of brute-force enumeration and comparison, as in transformers, the computational burden of complicated sequential dependencies, as in recurrent neural networks, or the parameter burden of convolutional networks with many or large filters. We instead take inspiration from wavelet-based multiresolution analysis to define a new building block for sequence modeling, which we call a MultiresLayer. The key component of our model is the multiresolution convolution, capturing multiscale trends in the input sequence. Our MultiresConv can be implemented with shared filters across a dilated causal convolution tree. Thus it garners the computational advantages of convolutional networks and the principled theoretical motivation of wavelet decompositions. Our MultiresLayer is straightforward to implement, requires significantly fewer parameters, and maintains at most an O(N log N) memory footprint for a length N sequence. Yet, by stacking such layers, our model yields state-of-the-art performance on a number of sequence classification and autoregressive density estimation tasks using the CIFAR-10, ListOps, and PTB-XL datasets.
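
A rough sketch of what "shared filters across a dilated causal convolution tree" can look like is given below. It is a simplified illustration, not the authors' MultiresLayer: the fixed Haar-like filters stand in for learned ones, there is no channel mixing or nonlinearity, and the function names are hypothetical. The same two filters are reused at every level, with the dilation doubling each time.

```python
def causal_dilated_conv(x, taps, dilation):
    """y[t] = sum_k taps[k] * x[t - k * dilation], with zeros before t = 0."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(taps):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


def multires_features(x, low, high, depth):
    """Build a wavelet-style tree with filters shared across levels.

    At each level the detail (high-pass) output is kept and the coarse
    (low-pass) output is passed on with doubled dilation. Returns
    depth + 1 sequences, each the same length as x.
    """
    feats = []
    coarse = list(x)
    for level in range(depth):
        dilation = 2 ** level
        feats.append(causal_dilated_conv(coarse, high, dilation))
        coarse = causal_dilated_conv(coarse, low, dilation)
    feats.append(coarse)
    return feats


x = [1.0] * 8
feats = multires_features(x, low=[0.5, 0.5], high=[0.5, -0.5], depth=3)
print(len(feats), len(feats[0]))
```

Storing depth + 1 sequences of length N with depth on the order of log N is consistent with the O(N log N) memory footprint mentioned in the abstract, while the number of filter parameters stays fixed regardless of depth.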
Full Text Available
