
Title: Taming Fat-Tailed (“Heavier-Tailed” with Potentially Infinite Variance) Noise in Federated Learning
Journal Name: Proc. NeurIPS
Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. A novel statistical method is proposed and investigated for estimating a heavy tailed density under mild smoothness assumptions. Statistical analyses of heavy-tailed distributions are susceptible to the problem of sparse information in the tail of the distribution getting washed away by unrelated features of a hefty bulk. The proposed Bayesian method avoids this problem by incorporating smoothness and tail regularization through a carefully specified semiparametric prior distribution, and is able to consistently estimate both the density function and its tail index at near minimax optimal rates of contraction. A joint, likelihood driven estimation of the bulk and the tail is shown to help improve uncertainty assessment in estimating the tail index parameter and offer more accurate and reliable estimates of the high tail quantiles compared to thresholding methods. Supplementary materials for this article are available online.
  2. We consider the task of heavy-tailed statistical estimation given streaming p-dimensional samples. This could also be viewed as stochastic optimization under heavy-tailed distributions, with an additional O(p) space complexity constraint. We design a clipped stochastic gradient descent algorithm and provide an improved analysis, under a more nuanced condition on the noise of the stochastic gradients, which we show is critical when analyzing stochastic optimization problems arising from general statistical estimation problems. Our results guarantee convergence not just in expectation but with exponential concentration, and moreover do so using O(1) batch size. We provide consequences of our results for mean estimation and linear regression. Finally, we provide empirical corroboration of our results and algorithms via synthetic experiments for mean estimation and linear regression.
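The second abstract describes clipped stochastic gradient descent for streaming mean estimation with O(p) memory and O(1) batch size. A minimal sketch of that general idea follows. It is not the authors' algorithm: the function names, the 1/t step size, and the default `clip_level` are assumptions made purely for illustration, and the abstract specifies none of them.

```python
import math
import random


def clip(vec, max_norm):
    """Rescale vec so that its Euclidean norm is at most max_norm."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm <= max_norm:
        return list(vec)
    scale = max_norm / norm
    return [v * scale for v in vec]


def clipped_sgd_mean(stream, dim, clip_level=5.0):
    """Estimate a mean from a stream via clipped SGD on 0.5 * ||theta - x||^2.

    Only the current iterate is stored (O(dim) memory), and each update
    uses a single sample (batch size 1). The 1/t step size and the
    clip_level default are illustrative assumptions.
    """
    theta = [0.0] * dim
    for t, x in enumerate(stream, start=1):
        # Stochastic gradient of 0.5 * ||theta - x||^2 is (theta - x).
        grad = [th - xi for th, xi in zip(theta, x)]
        # Clipping bounds the influence of any single extreme sample.
        grad = clip(grad, clip_level)
        step = 1.0 / t
        theta = [th - step * g for th, g in zip(theta, grad)]
    return theta


def pareto_stream(n, dim, shape=1.5, seed=0):
    """Hypothetical heavy-tailed test stream (Pareto; infinite variance when shape <= 2)."""
    rng = random.Random(seed)
    for _ in range(n):
        yield [rng.paretovariate(shape) for _ in range(dim)]


if __name__ == "__main__":
    estimate = clipped_sgd_mean(pareto_stream(10000, dim=3), dim=3)
    print(estimate)
```

Without clipping, the 1/t step size reduces this update to the running sample mean, which a single extreme draw can move arbitrarily far. Clipping caps how far one sample can move the iterate, at the cost of some bias that depends on the chosen clipping level.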