This content will become publicly available on May 6, 2026

Title: Mitigating mode collapse in normalizing flows by annealing with an adaptive schedule: Application to parameter estimation
Normalizing flows (NFs) provide uncorrelated samples from complex distributions, making them an appealing tool for parameter estimation. However, the practical utility of NFs remains limited by their tendency to collapse to a single mode of a multimodal distribution. In this study, we show that annealing with an adaptive schedule based on the effective sample size (ESS) can mitigate mode collapse. We demonstrate that our approach can converge the marginal likelihood for a biochemical oscillator model fit to time-series data in ten-fold less computation time than a widely used ensemble Markov chain Monte Carlo (MCMC) method. We show that the ESS can also be used to reduce variance by pruning the samples. We expect these developments to be of general use for sampling with NFs and discuss potential opportunities for further improvements.
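To make the idea of an ESS-based adaptive annealing schedule concrete, the following is a minimal sketch of the generic technique as it is commonly used in tempered importance sampling and sequential Monte Carlo. It is not taken from the paper, and the paper's actual procedure may differ. The function names (`effective_sample_size`, `next_beta`), the inverse-temperature variable `beta`, and the target ESS fraction of 0.5 are all illustrative assumptions.

```python
import numpy as np


def effective_sample_size(log_w):
    """Kish ESS, (sum w)^2 / sum w^2, computed stably from log-weights."""
    log_w = np.asarray(log_w, dtype=float)
    w = np.exp(log_w - log_w.max())  # rescaling the weights leaves the ESS unchanged
    return w.sum() ** 2 / np.square(w).sum()


def next_beta(log_like, beta, target_frac=0.5, tol=1e-6):
    """Pick the next inverse temperature by bisection (hypothetical helper).

    Samples are assumed to be drawn at inverse temperature `beta`, so moving
    to beta' reweights sample i by exp((beta' - beta) * log_like[i]).
    Returns (to within `tol`) the largest beta' in (beta, 1] whose
    incremental weights keep the ESS at or above target_frac * N.
    """
    log_like = np.asarray(log_like, dtype=float)
    target = target_frac * log_like.size
    # Jump straight to the full target if that already meets the ESS goal.
    if effective_sample_size((1.0 - beta) * log_like) >= target:
        return 1.0
    lo, hi = beta, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if effective_sample_size((mid - beta) * log_like) >= target:
            lo = mid  # step is safe; try a larger one
        else:
            hi = mid  # step degrades the weights too much; shrink it
    return lo
```

In a training loop, one would draw samples from the current model, call `next_beta` on their log-likelihoods, retrain or reweight at the returned value, and repeat until it reaches 1. The ESS is non-increasing in the step size, which is what makes bisection applicable here.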
Award ID(s):
2235451
PAR ID:
10612010
Author(s) / Creator(s):
; ;
Publisher / Repository:
arXiv:2505.03652
Date Published:
Journal Name:
arXiv.org
ISSN:
2331-8422
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Generative adversarial networks (GANs) are innovative techniques for learning generative models of complex data distributions from samples. Despite remarkable recent improvements in generating realistic images, one of their major shortcomings is the fact that in practice, they tend to produce samples with little diversity, even when trained on diverse datasets. This phenomenon, known as mode collapse, has been the main focus of several recent advances in GANs. Yet there is little understanding of why mode collapse happens and why recently proposed approaches are able to mitigate mode collapse. We propose a principled approach to handling mode collapse, which we call packing. The main idea is to modify the discriminator to make decisions based on multiple samples from the same class, either real or artificially generated. We borrow analysis tools from binary hypothesis testing—in particular the seminal result of Blackwell [6]—to prove a fundamental connection between packing and mode collapse. We show that packing naturally penalizes generators with mode collapse, thereby favoring generator distributions with less mode collapse during the training process. Numerical experiments on benchmark datasets suggest that packing provides significant improvements in practice as well.
  2. Generative adversarial networks (GANs) are a technique for learning generative models of complex data distributions from samples. Despite remarkable advances in generating realistic images, a major shortcoming of GANs is the fact that they tend to produce samples with little diversity, even when trained on diverse datasets. This phenomenon, known as mode collapse, has been the focus of much recent work. We study a principled approach to handling mode collapse, which we call packing. The main idea is to modify the discriminator to make decisions based on multiple samples from the same class, either real or artificially generated. We draw analysis tools from binary hypothesis testing—in particular the seminal result of Blackwell [4]—to prove a fundamental connection between packing and mode collapse. We show that packing naturally penalizes generators with mode collapse, thereby favoring generator distributions with less mode collapse during the training process. Numerical experiments on benchmark datasets suggest that packing provides significant improvements. 
  3. Generative adversarial networks (GANs) are a class of machine-learning models that use adversarial training to generate new samples with the same (potentially very complex) statistics as the training samples. One major form of training failure, known as mode collapse, involves the generator failing to reproduce the full diversity of modes in the target probability distribution. Here, we present an effective model of GAN training, which captures the learning dynamics by replacing the generator neural network with a collection of particles in the output space; particles are coupled by a universal kernel valid for certain wide neural networks and high-dimensional inputs. The generality of our simplified model allows us to study the conditions under which mode collapse occurs. Indeed, experiments which vary the effective kernel of the generator reveal a mode collapse transition, the shape of which can be related to the type of discriminator through the frequency principle. Further, we find that gradient regularizers of intermediate strengths can optimally yield convergence through critical damping of the generator dynamics. Our effective GAN model thus provides an interpretable physical framework for understanding and improving adversarial training. 
  4. Radioactive nuclei were present in the early solar system (ESS), as inferred from analysis of meteorites. Many are produced in massive stars, either during their lives or their final explosions. In the first paper of this series (Brinkman et al. 2019), we focused on the production of ²⁶Al in massive binaries. Here, we focus on the production of another two short-lived radioactive nuclei, ³⁶Cl and ⁴¹Ca, and the comparison to the ESS data. We used the MESA stellar evolution code with an extended nuclear network and computed massive (10–80 M⊙), rotating (with initial velocities of 150 and 300 km s⁻¹) and nonrotating single stars at solar metallicity (Z = 0.014) up to the onset of core collapse. We present the wind yields for the radioactive isotopes ²⁶Al, ³⁶Cl, and ⁴¹Ca, and the stable isotopes ¹⁹F and ²²Ne. In relation to the stable isotopes, we find that only the most massive models, ≥60 and ≥40 M⊙, give positive ¹⁹F and ²²Ne yields, respectively, depending on the initial rotation rate. In relation to the radioactive isotopes, we find that the ESS abundances of ²⁶Al and ⁴¹Ca can be matched by models with initial masses ≥40 M⊙, while ³⁶Cl is matched only by our most massive models, ≥60 M⊙. ⁶⁰Fe is not significantly produced by any wind model, as required by the observations. Therefore, massive star winds are a favored candidate for the origin of the very short-lived ²⁶Al, ³⁶Cl, and ⁴¹Ca in the ESS.
  5. The Intergovernmental Science–Policy Platform on Biodiversity and Ecosystem Services has called for assessments explicitly accounting for interregional flows of ecosystem services (ESs) across geographic scales. An important type of interregional ES flow is generated by the long‐distance movements of migratory species. Many migratory species provide important benefits to people, and due to migration dynamics, ESs provided in one location may be affected by habitat conservation, or lack thereof, in other locations. The state of the science on interregional flows of ESs from migratory species, however, is nascent and lacks structure needed to consistently characterize flows. We developed a 4‐tiered system for categorizing assessments and the conclusions they can support based on 4 levels of ecological and socioeconomic information, ranging from incomplete to high, and how they are combined. The 4 tiers of assessment are based on differing levels of detail in the estimation of system‐level ecological and socioeconomic information on a species and the services it provides: telecoupled ESs, qualitative flows, quantitative static flows, and quantitative dynamic flows. Recent assessment studies largely fall within the first tier, which does not quantify flows. Socioeconomic and ecological information are needed to achieve each tier. Our framework can be used to identify and classify a range of methods, with varying time and data requirements, that can be used to maximize the information content and relevance of ES assessments for migratory species based on available resources.