Learning to optimize (L2O), which automates the design of optimizers through data-driven approaches, has gained increasing popularity. However, current L2O methods often generalize poorly in at least two respects: (i) when the L2O-learned optimizer is applied to unseen optimizees, measured by how far it lowers their loss function values (optimizer generalization, or “generalizable learning of optimizers”); and (ii) in the test performance of an optimizee (itself a machine learning model) trained by the optimizer, measured by accuracy on unseen data (optimizee generalization, or “learning to generalize”). While optimizer generalization has been studied recently, optimizee generalization has not been rigorously studied in the L2O context, and it is the focus of this paper. We first theoretically establish an implicit connection between the local entropy and the Hessian, and hence unify their roles in the handcrafted design of generalizable optimizers as equivalent metrics of the flatness of the loss landscape. We then propose to incorporate these two metrics as flatness-aware regularizers into the L2O framework in order to meta-train optimizers to learn to generalize, and theoretically show that such generalization ability can be learned during the L2O meta-training process and then transformed to the optimizee loss function. Extensive experiments consistently validate the effectiveness of our proposals, with substantially improved generalization on multiple sophisticated L2O models and diverse optimizees.
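To make the idea of a Hessian-based flatness penalty concrete, here is a minimal sketch on a toy quadratic loss. It is not the regularizer from the abstract above: it assumes the Hessian trace as a stand-in flatness metric, estimated with Hutchinson's method and finite-difference Hessian-vector products. All names (`loss`, `grad`, `hessian_trace_estimate`, `regularized_loss`, `lam`) are hypothetical and chosen for illustration only.

```python
import numpy as np


def loss(w, A):
    # Toy optimizee loss 0.5 * w^T A w; for symmetric A the Hessian is A itself.
    return 0.5 * w @ A @ w


def grad(w, A):
    # Analytic gradient of the toy loss (assumes A is symmetric).
    return A @ w


def hessian_trace_estimate(grad_fn, w, num_samples=200, eps=1e-3, seed=0):
    # Hutchinson estimator: trace(H) is approximated by the mean of v^T H v
    # over random Rademacher vectors v (entries are +1 or -1).
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(num_samples):
        v = rng.choice([-1.0, 1.0], size=w.shape)
        # Central finite-difference approximation of the Hessian-vector product H v.
        hv = (grad_fn(w + eps * v) - grad_fn(w - eps * v)) / (2.0 * eps)
        total += v @ hv
    return total / num_samples


def regularized_loss(w, A, lam=0.1):
    # Assumed form: loss plus lam times the Hessian-trace sharpness proxy.
    sharpness = hessian_trace_estimate(lambda u: grad(u, A), w)
    return loss(w, A) + lam * sharpness


if __name__ == "__main__":
    w = np.array([0.5, -0.5])
    sharp = np.diag([10.0, 10.0])  # high curvature: a sharp minimum
    flat = np.diag([0.1, 0.1])     # low curvature: a flat minimum
    print(regularized_loss(w, sharp), regularized_loss(w, flat))
```

Under this assumed penalty, a minimum with larger curvature pays a larger regularization cost, which is the qualitative behaviour a flatness-aware regularizer is meant to encourage.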
Humans in the Loop: Learning to Trust in AI but to What Extent?
- Award ID(s): 1828010
- PAR ID: 10277380
- Date Published:
- Journal Name: IEEE Transactions on Technology and Society
- Volume: 1
- Issue: 4
- ISSN: 2637-6415
- Page Range / eLocation ID: 174 to 174
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Abstract We show that the entry of formal financial institutions can have far-reaching and long-lasting impacts on informal lending and social networks more generally. We first study the introduction of microfinance in 75 villages in Karnataka, India, 43 of which were exposed to microfinance. Using difference-in-differences, we show that networks shrank more in exposed villages. Moreover, links between households that were both unlikely to borrow from microfinance were at least as likely to disappear as links involving likely borrowers. We replicate these surprising findings in the context of a randomised controlled trial (RCT) in Hyderabad, where a microfinance institution randomly selected 52 of 104 neighbourhoods to enter first. Four years after all neighbourhoods were treated, households in early-entry neighbourhoods had credit access longer and had larger loans. We again find fewer social relationships between households in these neighbourhoods, even among those ex-ante unlikely to borrow. Because the results suggest global spillovers, atypical in usual models of network formation, we develop a new dynamic model of network formation that emphasizes chance meetings, where efforts to socialize generate a global network-level externality. Finally, we analyse informal borrowing and the sensitivity of consumption to income fluctuations. Households unlikely to take up microcredit suffer the greatest loss of informal borrowing and risk sharing, underscoring the global nature of the externality.