Title: Universal consistency of wide and deep ReLU neural networks and minimax optimal convergence rates for Kolmogorov-Donoho optimal function classes
Award ID(s):
2229876
PAR ID:
10525483
Author(s) / Creator(s):
Publisher / Repository:
International Conference on Machine Learning (ICML 2024)
Date Published:
Format(s):
Medium: X
Location:
Vienna, Austria
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract: In the absence of drugs and vaccines, policymakers use non-pharmaceutical interventions such as social distancing to decrease rates of disease-causing contact, with the aim of reducing or delaying the epidemic peak. These measures carry social and economic costs, so societies may be unable to maintain them for more than a short period of time. Intervention policy design often relies on numerical simulations of epidemic models, but comparing policies and assessing their robustness demands clear principles that apply across strategies. Here we derive the theoretically optimal strategy for using a time-limited intervention to reduce the peak prevalence of a novel disease in the classic Susceptible-Infectious-Recovered epidemic model. We show that broad classes of easier-to-implement strategies can perform nearly as well as the theoretically optimal strategy. But neither the optimal strategy nor any of these near-optimal strategies is robust to implementation error: small errors in timing the intervention produce large increases in peak prevalence. Our results reveal fundamental principles of non-pharmaceutical disease control and expose their potential fragility. For robust control, an intervention must be strong, early, and ideally sustained. (A minimal SIR simulation sketch of such a time-limited intervention appears after this list.)
  2. This paper considers the fundamental convergence time for opportunistic scheduling over time-varying channels. The channel state probabilities are unknown, and algorithms must perform some type of estimation and learning while they make decisions to optimize network utility. Existing schemes can achieve a utility within ε of optimality, for any desired ε > 0, with convergence and adaptation times of O(1/ε^2). This paper shows that if the utility function is concave and smooth, then O(log(1/ε)/ε) convergence time is possible via an existing stochastic variation on the Frank-Wolfe algorithm, called the RUN algorithm. Next, a converse result is proven to show it is impossible for any algorithm to have convergence time better than O(1/ε), provided the algorithm has no a priori knowledge of channel state probabilities. Hence, RUN is within a logarithmic factor of convergence-time optimality. However, RUN has a vanishing stepsize and hence an infinite adaptation time. Using stochastic Frank-Wolfe with a fixed stepsize yields improved O(1/ε^2) adaptation time, but convergence time increases to O(1/ε^2), similar to existing drift-plus-penalty based algorithms. This raises important open questions regarding optimal adaptation. (A toy Frank-Wolfe step-size sketch appears after this list, following the SIR example.)
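The SIR dynamics described in the first abstract are straightforward to sketch numerically. Below is a minimal forward-Euler simulation in which a time-limited intervention scales the transmission rate down during a fixed window; the parameter values, the function name simulate_sir, and the intervention rule are illustrative assumptions, not the paper's derivation. The timing comparison at the end only illustrates the fragility claim: shifting the intervention window by a few weeks changes the peak.

```python
# Minimal SIR simulation with a time-limited intervention (a sketch;
# all parameter values and the intervention rule are illustrative
# assumptions, not taken from the paper).

def simulate_sir(beta=0.3, gamma=0.1, days=300, dt=0.1,
                 t_on=None, duration=0.0, strength=0.0):
    """Forward-Euler SIR. While t is in [t_on, t_on + duration), the
    transmission rate beta is scaled by (1 - strength). Returns the
    peak infectious prevalence over the run."""
    s, i = 0.999, 0.001          # initial susceptible / infectious fractions
    peak = i
    for k in range(int(days / dt)):
        t = k * dt
        active = t_on is not None and t_on <= t < t_on + duration
        b = beta * (1.0 - strength) if active else beta
        new_inf = b * s * i * dt             # new infections this step
        rec = gamma * i * dt                 # recoveries this step
        s, i = s - new_inf, i + new_inf - rec
        peak = max(peak, i)
    return peak

# Small errors in timing the intervention change the peak prevalence:
print("no intervention:", round(simulate_sir(), 4))
print("on-time window :", round(simulate_sir(t_on=40, duration=60, strength=0.6), 4))
print("20 days late   :", round(simulate_sir(t_on=60, duration=60, strength=0.6), 4))
```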
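The step-size trade-off in the second abstract can be illustrated with a toy stochastic Frank-Wolfe loop that maximizes a smooth concave utility over the probability simplex (standing in for the network's throughput region). This is only a sketch of the generic method, not the RUN algorithm from the paper; the utility sum(log(1 + x)), the mean rates mu, the channel model, and both step-size schedules are assumptions chosen for illustration.

```python
# Toy stochastic Frank-Wolfe over the probability simplex (a sketch; the
# utility, channel model, and step-size schedules are illustrative
# assumptions, not the RUN algorithm itself).

import numpy as np

rng = np.random.default_rng(0)
mu = np.array([0.9, 0.5, 0.2])    # assumed mean channel success probabilities

def grad_utility(x):
    # Gradient of the concave, smooth utility U(x) = sum(log(1 + x)).
    return 1.0 / (1.0 + x)

def frank_wolfe(steps, fixed_step=None):
    x = np.ones(3) / 3            # start at the simplex centroid
    for t in range(steps):
        obs = (rng.random(3) < mu).astype(float)   # noisy channel sample
        g = grad_utility(x) * obs                  # stochastic gradient estimate
        s = np.zeros(3)
        s[np.argmax(g)] = 1.0     # linear maximization over the simplex: a vertex
        # A vanishing step-size like 2/(t+2) converges but cannot keep
        # adapting; a fixed step-size keeps adapting at the cost of accuracy.
        gamma = fixed_step if fixed_step is not None else 2.0 / (t + 2.0)
        x = (1.0 - gamma) * x + gamma * s          # stays inside the simplex
    return x

print("vanishing step-size:", frank_wolfe(5000).round(3))
print("fixed step-size    :", frank_wolfe(5000, fixed_step=0.01).round(3))
```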