We provide a global, long-term carbon flux dataset of gross primary production and ecosystem respiration generated using meta-learning, called
- PAR ID:
- 10430996
- Publisher / Repository:
- Nature Publishing Group
- Date Published:
- Journal Name:
- Scientific Data
- Volume:
- 10
- Issue:
- 1
- ISSN:
- 2052-4463
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Learning-to-learn (using optimization algorithms to learn a new optimizer) has successfully trained efficient optimizers in practice. This approach relies on meta-gradient descent on a meta-objective based on the trajectory that the optimizer generates. However, there were few theoretical guarantees on how to avoid meta-gradient explosion/vanishing problems, or how to train an optimizer with good generalization performance. In this paper, we study the learning-to-learn approach on a simple problem of tuning the step size for quadratic loss. Our results show that although there is a way to design the meta-objective so that the meta-gradient remains polynomially bounded, computing the meta-gradient directly using backpropagation leads to numerical issues that look similar to gradient explosion/vanishing problems. We also characterize when it is necessary to compute the meta-objective on a separate validation set instead of the original training set. Finally, we verify our results empirically and show that a similar phenomenon appears even for more complicated learned optimizers parametrized by neural networks.
-
Abstract Adaptive ‘life-long’ learning at the edge and during online task performance is an aspirational goal of artificial intelligence research. Neuromorphic hardware implementing spiking neural networks (SNNs) is particularly attractive in this regard, as its real-time, event-based, local computing paradigm makes it suitable for edge implementations and fast learning. However, the long and iterative learning that characterizes state-of-the-art SNN training is incompatible with the physical nature and real-time operation of neuromorphic hardware. Bi-level learning, such as meta-learning, is increasingly used in deep learning to overcome these limitations. In this work, we demonstrate gradient-based meta-learning in SNNs using the surrogate gradient method that approximates the spiking threshold function for gradient estimations. Because surrogate gradients can be made twice differentiable, well-established and effective second-order gradient meta-learning methods such as model-agnostic meta-learning (MAML) can be used. We show that SNNs meta-trained using MAML perform comparably to conventional artificial neural networks meta-trained with MAML on event-based meta-datasets. Furthermore, we demonstrate the specific advantages that accrue from meta-learning: fast learning without the requirement of high precision weights or gradients, training-to-learn with quantization, and mitigating the effects of approximate synaptic plasticity rules. Our results emphasize how meta-learning techniques can become instrumental for deploying neuromorphic learning technologies on real-world problems.
-
Abstract Examination of the reactions of σ‐type quinolinium‐based triradicals with cyclohexane in the gas phase demonstrated that the radical site that is the least strongly coupled to the other two radical sites reacts first, independent of the intrinsic reactivity of this radical site, in contrast to related biradicals that first react at the most electron‐deficient radical site. Abstraction of one or two H atoms and formation of an ion that formally corresponds to a combination of the ion and cyclohexane accompanied by elimination of a H atom (“addition‐H”) were observed. In all cases except one, the most reactive radical site of the triradicals is intrinsically less reactive than the other two radical sites. The product complex of the first H atom abstraction either dissociates to give the H‐atom‐abstraction product and the cyclohexyl radical or the more reactive radical site in the produced biradical abstracts a H atom from the cyclohexyl radical. The monoradical product sometimes adds to cyclohexene followed by elimination of a H atom, generating the “addition‐H” products. Similar reaction efficiencies were measured for three of the triradicals as for relevant monoradicals. Surprisingly, the remaining three triradicals (all containing a meta‐pyridyne moiety) reacted substantially faster than the relevant monoradicals. This is likely due to the exothermic generation of a meta‐pyridyne analog that has enough energy to attain the dehydrocarbon atom separation common for H‐atom‐abstraction transition states of protonated meta‐pyridynes.
-
Abstract Deep learning (DL) models trained on hydrologic observations can perform extraordinarily well, but they can inherit deficiencies of the training data, such as limited coverage of in situ data or low resolution/accuracy of satellite data. Here we propose a novel multiscale DL scheme learning simultaneously from satellite and in situ data to predict 9 km daily soil moisture (5 cm depth). Based on spatial cross‐validation over sites in the conterminous United States, the multiscale scheme obtained a median correlation of 0.901 and root‐mean‐square error of 0.034 m³/m³. It outperformed the Soil Moisture Active Passive satellite mission's 9 km product, DL models trained on in situ data alone, and land surface models. Our 9 km product showed better accuracy than previous 1 km satellite downscaling products, highlighting limited impacts of improving resolution. Not only is our product useful for planning against floods, droughts, and pests, our scheme is generically applicable to geoscientific domains with data on multiple scales, breaking the confines of individual data sets.