NSF PAR Search | NSF Public Access Repository

Discovering Data Structures: Nearest Neighbor Search and Beyond

Salemohamed, Omar; Charlin, Laurent; Garg, Shivam; Sharan, Vatsal; Valiant, Gregory (December 2025, Advances in Neural Information Processing Systems)

Free, publicly-accessible full text available December 1, 2026
On the statistical complexity of sample amplification

https://doi.org/10.1214/24-AOS2444

Axelrod, Brian; Garg, Shivam; Han, Yanjun; Sharan, Vatsal; Valiant, Gregory (December 2024, The Annals of Statistics)

Full Text Available
Machine Unlearning via Simulated Oracle Matching

Georgiev, Kristian; Rinberg, Roy; Park, Sung Min; Garg, Shivam; Ilyas, Andrew; Madry, Aleksander; Neel, Seth (July 2024, ICML GenLaw Workshop (https://www.genlaw.org/2024-icml))

Machine unlearning, the task of efficiently removing the effect of a small "forget set" of training data on a pre-trained machine learning model, has recently attracted significant research interest. Despite this interest, however, recent work shows that existing machine unlearning techniques do not hold up to thorough evaluation in non-convex settings. In this work, we introduce a new machine unlearning technique that exhibits strong empirical performance even in such challenging settings. Our starting point is the perspective that the goal of unlearning is to produce a model whose outputs are statistically indistinguishable from those of a model re-trained on all but the forget set. This perspective naturally suggests a reduction from the unlearning problem to that of data attribution, where the goal is to predict the effect of changing the training set on a model's outputs. Thus motivated, we propose the following meta-algorithm, which we call Datamodel Matching (DMM): given a trained model, we (a) use data attribution to predict the output of the model if it were re-trained on all but the forget set points; then (b) fine-tune the pre-trained model to match these predicted outputs. In a simple convex setting, we show how this approach provably outperforms a variety of iterative unlearning algorithms. Empirically, we use a combination of existing evaluations and a new metric based on the KL-divergence to show that even in non-convex settings, DMM achieves strong unlearning performance relative to existing algorithms. An added benefit of DMM is that it is a meta-algorithm, in the sense that future advances in data attribution translate directly into better unlearning algorithms, pointing to a clear direction for future progress in unlearning.
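The two steps of the meta-algorithm in the abstract above can be illustrated with a toy sketch in a convex setting (linear least squares). Everything here is an assumption made for illustration: the function names are hypothetical, and step (a) is replaced by actually re-training without the forget set, a stand-in for the prediction that a data-attribution method would supply. It is not the authors' implementation.

```python
import numpy as np

def fit_least_squares(X, y):
    """Closed-form least-squares fit (the 'training' step in this toy setting)."""
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return theta

def predict_retrained_outputs(X, y, forget_idx):
    """Stand-in for step (a): ASSUMPTION, we simply re-train without the forget
    set; a real data-attribution method would only approximate these outputs."""
    keep = np.setdiff1d(np.arange(len(y)), forget_idx)
    theta_oracle = fit_least_squares(X[keep], y[keep])
    return X @ theta_oracle

def match_outputs(theta, X, targets, steps=200):
    """Step (b): fine-tune theta by gradient descent so X @ theta matches targets."""
    n = len(targets)
    # Step size 1/L, where L is the smoothness constant of this quadratic loss.
    lr = 1.0 / np.linalg.eigvalsh(X.T @ X / n).max()
    for _ in range(steps):
        theta = theta - lr * X.T @ (X @ theta - targets) / n
    return theta

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ rng.normal(size=5) + 0.5 * rng.normal(size=200)
forget_idx = np.arange(20)          # hypothetical forget set: the first 20 points

theta_full = fit_least_squares(X, y)                      # model trained on all data
targets = predict_retrained_outputs(X, y, forget_idx)     # step (a), simulated
theta_unlearned = match_outputs(theta_full, X, targets)   # step (b)

gap_before = np.linalg.norm(X @ theta_full - targets)
gap_after = np.linalg.norm(X @ theta_unlearned - targets)
print(gap_before, gap_after)
```

The printed gaps measure how far the model's outputs are from the simulated re-trained outputs before and after fine-tuning; in a realistic pipeline the targets would come from an attribution estimate rather than from re-training.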
Full Text Available
AAV6 mediated Gsx1 expression in neural stem progenitor cells promotes neurogenesis and restores locomotor function after contusion spinal cord injury

https://doi.org/10.1016/j.neurot.2024.e00362

Finkel, Zachary; Esteban, Fatima; Rodriguez, Brianna; Clifford, Tanner; Joseph, Adelina; Alostaz, Hani; Dalmia, Mridul; Gutierrez, Juan; Tamasi, Matthew J; Zhang, Samuel Ming; et al (July 2024, Neurotherapeutics)

Full Text Available
Distributed algorithms from arboreal ants for the shortest path problem

https://doi.org/10.1073/pnas.2207959120

Garg, Shivam; Shiragur, Kirankumar; Gordon, Deborah M.; Charikar, Moses (February 2023, Proceedings of the National Academy of Sciences)

Colonies of the arboreal turtle ant create networks of trails that link nests and food sources on the graph formed by branches and vines in the canopy of the tropical forest. Ants put down a volatile pheromone on the edges as they traverse them. At each vertex, the next edge to traverse is chosen using a decision rule based on the current pheromone level. There is a bidirectional flow of ants around the network. In a previous field study, it was observed that the trail networks approximately minimize the number of vertices, thus solving a variant of the popular shortest path problem without any central control and with minimal computational resources. We propose a biologically plausible model, based on a variant of the reinforced random walk on a graph, which explains this observation and suggests surprising algorithms for the shortest path problem and its variants. Through simulations and analysis, we show that when the rate of flow of ants does not change, the dynamics converges to the path with the minimum number of vertices, as observed in the field. The dynamics converges to the shortest path when the rate of flow increases with time, so the colony can solve the shortest path problem merely by increasing the flow rate. We also show that to guarantee convergence to the shortest path, bidirectional flow and a decision rule dividing the flow in proportion to the pheromone level are necessary, but convergence to approximately short paths is possible with other decision rules.
Full Text Available
How and When Random Feedback Works: A Case Study of Low-Rank Matrix Factorization

Garg, Shivam; Vempala, Santosh (January 2022, International Conference on Artificial Intelligence and Statistics (AISTATS))

Full Text Available
How and When Random Feedback Works: A Case Study of Low-Rank Matrix Factorization.

Garg, Shivam; Vempala, Santosh S. (January 2022, AISTATS)

The success of gradient descent in ML and especially for learning neural networks is remarkable and robust. In the context of how the brain learns, one aspect of gradient descent that appears biologically difficult to realize (if not implausible) is that its updates rely on feedback from later layers to earlier layers through the same connections. Such bidirected links are relatively few in brain networks, and even when reciprocal connections exist, they may not be equi-weighted. Random Feedback Alignment (Lillicrap et al., 2016), where the backward weights are random and fixed, has been proposed as a bio-plausible alternative and found to be effective empirically. We investigate how and when feedback alignment (FA) works, focusing on one of the most basic problems with layered structure: low-rank matrix factorization. Given an n×m matrix Y, the goal is to find a low-rank factorization ZW, with Z of size n×r and W of size r×m, that minimizes the error ∥ZW−Y∥_F. Gradient descent solves this problem optimally. We show that FA finds the optimal solution when r ≥ rank(Y). We also shed light on how FA works. It is observed empirically that the forward weight matrices and (random) feedback matrices come closer during FA updates. Our analysis rigorously derives this phenomenon and shows how it facilitates convergence of FA*, a closely related variant of FA. We also show that FA can be far from optimal when r < rank(Y).
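The difference between gradient descent and feedback alignment on this factorization problem comes down to which matrix carries the error back to the earlier factor. The sketch below is a minimal illustration under stated assumptions: the loss is taken as half the squared Frobenius error, W is treated as the earlier layer, and the names `step`, `run`, and the feedback matrix `C` are hypothetical. It is not the paper's code and makes no claim about convergence rates.

```python
import numpy as np

def loss(Z, W, Y):
    """Half the squared Frobenius error of the factorization Z @ W."""
    return 0.5 * np.linalg.norm(Z @ W - Y) ** 2

def step(Z, W, Y, B, lr):
    """One simultaneous update of both factors.

    B is the matrix whose transpose carries the error back to W:
    B = Z gives plain gradient descent, a fixed random B gives feedback alignment.
    """
    E = Z @ W - Y                  # residual
    Z_new = Z - lr * E @ W.T       # later layer: same update in both methods
    W_new = W - lr * B.T @ E       # earlier layer: feedback through B instead of Z
    return Z_new, W_new

def run(Y, r, feedback=None, lr=0.05, steps=2000, seed=0):
    """Iterate `step` from a small random initialization (an assumed choice)."""
    rng = np.random.default_rng(seed)
    n, m = Y.shape
    Z = 0.1 * rng.normal(size=(n, r))
    W = 0.1 * rng.normal(size=(r, m))
    for _ in range(steps):
        B = Z if feedback is None else feedback
        Z, W = step(Z, W, Y, B, lr)
    return Z, W

rng = np.random.default_rng(1)
Y = rng.normal(size=(6, 2)) @ rng.normal(size=(2, 5))   # rank-2 target
Y /= np.linalg.norm(Y, 2)                               # normalize spectral norm to 1
C = rng.normal(size=(6, 3)) / np.sqrt(6)                # fixed random feedback matrix

Z_gd, W_gd = run(Y, r=3)                # gradient descent
Z_fa, W_fa = run(Y, r=3, feedback=C)    # feedback alignment, with r >= rank(Y)
print(loss(Z_gd, W_gd, Y), loss(Z_fa, W_fa, Y))
```

Passing `feedback=None` recovers gradient descent because the backward matrix is then the current forward factor Z; the step size, initialization scale, and iteration count are arbitrary choices for the sketch.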
Full Text Available
What Can Transformers Learn In-Context? A Case Study of Simple Function Classes

Garg, Shivam; Tsipras, Dimitris; Liang, Percy; Valiant, Gregory (January 2022, Advances in Neural Information Processing Systems 35 (NeurIPS 2022))

Full Text Available
A Model for Ant Trail Formation and its Convergence Properties

https://doi.org/10.4230/LIPIcs.ITCS.2021.85

Charikar, Moses; Garg, Shivam; Gordon, Deborah M.; Shiragur, Kirankumar (January 2021, 12th Innovations in Theoretical Computer Science Conference (ITCS 2021))
We introduce a model for ant trail formation, building upon previous work on biologically feasible local algorithms that plausibly describe how ants maintain trail networks. The model is a variant of a reinforced random walk on a directed graph, where ants lay pheromone on edges as they traverse them and the next edge to traverse is chosen based on the level of pheromone; this pheromone decays with time. There is a bidirectional flow of ants in the network: the forward flow proceeds along forward edges from source (e.g. the nest) to sink (e.g. a food source), and the backward flow in the opposite direction. Some fraction of ants are lost as they pass through each node (modeling the loss of ants due to exploration observed in the field). We initiate a theoretical study of this model. We note that ant navigation has inspired the field of ant colony optimization, heuristics that have been applied to several combinatorial optimization problems; however, the algorithms developed there are considerably more complex and not constrained to being biologically feasible. We first consider the linear decision rule, where the flow divides itself among the next set of edges in proportion to their pheromone level. Here, we show that the process converges to the path with minimum leakage when the forward and backward flows do not change over time. On the other hand, when the forward and backward flows increase over time (caused by positive reinforcement from the discovery of a food source, for example), we show that the process converges to the shortest path. These results are for graphs consisting of two parallel paths (a case that has been investigated before in experiments). Through simulations, we show that these results hold for more general graphs drawn from various random graph models; proving this convergence in the general case is an interesting open problem.
Further, to understand the behaviour of other decision rules beyond the linear rule, we consider a general family of decision rules. For this family, we show that there is no advantage of using a non-linear decision rule, if the goal is to find the shortest or the minimum leakage path. We also show that bidirectional flow is necessary for convergence to such paths. Our results provide a plausible explanation for field observations, and open up new avenues for further theoretical and experimental investigation.
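The constant-flow case with the linear decision rule on two parallel paths can be illustrated with a heavily simplified sketch. The assumptions are mine, not the paper's: flows are deterministic with one unit per step in each direction, pheromone is tracked only on each path's edge next to the source and next to the sink, every node leaks the same fraction of ants, and the function and parameter names are hypothetical.

```python
def simulate(nodes_per_path=(2, 5), leak=0.1, decay=0.1, steps=2000):
    """Return the share of forward flow sent down each path after `steps` rounds."""
    # Fraction of ants that survive a full traversal of each path.
    survive = [(1.0 - leak) ** k for k in nodes_per_path]
    src = [1.0] * len(survive)   # pheromone on each path's edge next to the source
    snk = [1.0] * len(survive)   # pheromone on each path's edge next to the sink
    for _ in range(steps):
        # Linear decision rule: flow splits in proportion to pheromone.
        fwd = [p / sum(src) for p in src]
        bwd = [p / sum(snk) for p in snk]
        # Decay, then deposit: departing flow arrives in full on the near edge,
        # while flow from the far end arrives reduced by leakage along the path.
        src = [(1 - decay) * p + f + s * b
               for p, f, b, s in zip(src, fwd, bwd, survive)]
        snk = [(1 - decay) * p + b + s * f
               for p, f, b, s in zip(snk, fwd, bwd, survive)]
    return [p / sum(src) for p in src]

shares = simulate()
print(shares)
```

In this toy version the path with fewer nodes has the lower leakage, so its pheromone is reinforced slightly more each round and the forward flow concentrates on it, in line with the minimum-leakage convergence described above. With the backward deposits removed the two edges at the source would be reinforced only by their own share, which is one way to see why bidirectional flow matters in this setup.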
Full Text Available
Sample Amplification: Increasing Dataset Size even when Learning is Impossible, ICML

Axelrod, Brian; Garg, Shivam; Sharan, Vatsal; Valiant, Gregory (July 2020, Proceedings of Machine Learning Research)

Given data drawn from an unknown distribution, D, to what extent is it possible to "amplify" this dataset and faithfully output an even larger set of samples that appear to have been drawn from D? We formalize this question as follows: an (n,m) amplification procedure takes as input n independent draws from an unknown distribution D, and outputs a set of m > n "samples" which must be indistinguishable from m samples drawn iid from D. We consider this sample amplification problem in two fundamental settings: the case where D is an arbitrary discrete distribution supported on k elements, and the case where D is a d-dimensional Gaussian with unknown mean, and fixed covariance matrix. Perhaps surprisingly, we show a valid amplification procedure exists for both of these settings, even in the regime where the size of the input dataset, n, is significantly less than what would be necessary to learn distribution D to non-trivial accuracy. We also show that our procedures are optimal up to constant factors. Beyond these results, we describe potential applications of such data amplification, and formalize a number of curious directions for future research along this vein.
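To make the interface of an (n,m) amplification procedure concrete, here is a naive plug-in sketch for the Gaussian setting with unknown mean and identity covariance. This is an illustrative assumption, not necessarily the procedure analysed in the paper: it keeps the n inputs and appends m − n draws centred at the empirical mean, and nothing in the code certifies that the output is indistinguishable from m true samples. The function name is hypothetical.

```python
import numpy as np

def amplify_gaussian(samples, m, rng):
    """Naive (n, m) amplifier for N(mu, I) with unknown mu.

    Keeps the n input rows and appends m - n draws from N(empirical mean, I).
    ASSUMPTION: identity covariance; this plug-in scheme is for illustration only.
    """
    n, d = samples.shape
    if m <= n:
        raise ValueError("amplification requires m > n")
    mu_hat = samples.mean(axis=0)
    extra = rng.normal(loc=mu_hat, scale=1.0, size=(m - n, d))
    return np.vstack([samples, extra])

rng = np.random.default_rng(0)
x = rng.normal(size=(50, 10)) + 3.0      # n = 50 draws in d = 10 dimensions
amplified = amplify_gaussian(x, 60, rng)
print(amplified.shape)
```

The output is returned in a fixed order (inputs first) to keep the sketch easy to inspect; since the abstract speaks of a set of samples, a real procedure would not expose which rows are original.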
Full Text Available
