

Search for: All records

Creators/Authors contains: "Panda, Ashwinee"

Note: Clicking a Digital Object Identifier (DOI) number takes you to an external site maintained by the publisher. Some full-text articles may not yet be available free of charge during the embargo (administrative interval).

Some links on this page may take you to non-federal websites, whose policies may differ from those of this site.

  1. FetchSGD: Communication-Efficient Federated Learning with Sketching
    Existing approaches to federated learning suffer from a communication bottleneck as well as convergence issues due to sparse client participation. In this paper we introduce a novel algorithm, called FetchSGD, to overcome these challenges. FetchSGD compresses model updates using a Count Sketch, and then takes advantage of the merge-ability of sketches to combine model updates from many workers. A key insight in the design of FetchSGD is that, because the Count Sketch is linear, momentum and error accumulation can both be carried out within the sketch. This allows the algorithm to move momentum and error accumulation from clients to the central aggregator, overcoming the challenges of sparse client participation while still achieving high compression rates and good convergence. We prove that FetchSGD has favorable convergence guarantees, and we demonstrate its empirical effectiveness by training two residual networks and a transformer model. 
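The abstract above is concrete enough to sketch the mechanics it describes: each client compresses its update into a Count Sketch, the server merges the uploaded sketches by simple addition, momentum and error accumulation are carried in sketch space on the aggregator, and a top-k update is extracted and applied each round. The following minimal NumPy sketch illustrates that flow under stated assumptions; the CountSketch class, the server_step function, and all hyperparameters (rows, cols, k, learning rate, momentum) are hypothetical choices of mine, not code from the paper or its released implementation.

import numpy as np

class CountSketch:
    # Linear sketch: sketch(a*x + b*y) == a*sketch(x) + b*sketch(y).
    # That linearity is what lets momentum and error accumulation be
    # carried entirely inside the sketch on the aggregator.
    def __init__(self, dim, rows=5, cols=1000, seed=0):
        rng = np.random.default_rng(seed)  # shared seed => shared hashes
        self.rows, self.cols = rows, cols
        self.buckets = rng.integers(0, cols, size=(rows, dim))
        self.signs = rng.choice([-1.0, 1.0], size=(rows, dim))
        self.table = np.zeros((rows, cols))

    def accumulate(self, vec):
        # Add a dense vector into the sketch table.
        for r in range(self.rows):
            np.add.at(self.table[r], self.buckets[r], self.signs[r] * vec)

    def query(self):
        # Median-of-rows estimate of every coordinate (approximate unsketch).
        est = np.stack([self.signs[r] * self.table[r, self.buckets[r]]
                        for r in range(self.rows)])
        return np.median(est, axis=0)

def make_sketch(dim):
    return CountSketch(dim, rows=5, cols=1000, seed=0)

def server_step(weights, client_grads, mom_sk, err_sk, lr=0.1, rho=0.9, k=50):
    # One FetchSGD-style round, seen from the central aggregator.
    dim = weights.size
    round_sk = make_sketch(dim)
    for g in client_grads:
        local = make_sketch(dim)       # in practice computed on-device;
        local.accumulate(g)            # only the small table is uploaded
        round_sk.table += local.table  # mergeability: sketches just add
    # Momentum and error accumulation live in sketch space, on the server.
    mom_sk.table = rho * mom_sk.table + round_sk.table
    err_sk.table += lr * mom_sk.table
    # Unsketch, keep the k heaviest coordinates, and apply them.
    est = err_sk.query()
    delta = np.zeros(dim)
    top = np.argpartition(np.abs(est), -k)[-k:]
    delta[top] = est[top]
    # Subtract the applied update from the error sketch (linearity again).
    applied = make_sketch(dim)
    applied.accumulate(delta)
    err_sk.table -= applied.table
    return weights - delta

# Toy usage: a 10,000-parameter "model" and 8 synthetic client gradients.
dim = 10_000
weights = np.zeros(dim)
mom_sk, err_sk = make_sketch(dim), make_sketch(dim)
grads = [np.random.default_rng(i).normal(size=dim) for i in range(8)]
weights = server_step(weights, grads, mom_sk, err_sk)

Note that the merge step is only valid because every client uses the same hash functions (the shared seed above), and the top-k extraction here is a direct selection over the fully unsketched estimate rather than the streaming heavy-hitters query a production implementation would use.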