Rapid inference and comparison of gravitational-wave population models with neural variational posteriors
Authors
Matthew Mould, Noah E. Wolfe, Salvatore Vitale
Abstract
The LIGO-Virgo-KAGRA catalog has been analyzed with an abundance of different population models due to theoretical uncertainty in the formation of gravitational-wave sources. To expedite model exploration, we introduce an efficient and accurate variational Bayesian approach that learns the population posterior with a normalizing flow and serves as a drop-in replacement for existing samplers. With hardware acceleration, inference takes just seconds for the current set of black-hole mergers and readily scales to larger catalogs. The trained posteriors provide an arbitrary number of independent samples with exact probability densities, unlike established stochastic sampling algorithms, while requiring up to three orders of magnitude fewer likelihood evaluations and as few as $\mathcal{O}(10^3)$. Provided the posterior support is covered, discrepancies can be addressed with smoothed importance sampling, which quantifies a goodness-of-fit metric for the variational approximation while also estimating the evidence for Bayesian model selection. Neural variational inference thus enables interactive development, analysis, and comparison of population models, making it a useful tool for astrophysical interpretation of current and future gravitational-wave observations.
Concepts
The Big Picture
Imagine you’re an astronomer trying to understand how black holes form across the universe. Your detector, a gravitational-wave observatory stretching kilometers across, has just recorded dozens of black hole collisions. Each one carries a faint imprint of the physics that created it: the masses, the spins, the distances. But reading that imprint means fitting a model to the entire population of events at once. And fitting that model can take hours, every time you try a slightly different theoretical assumption.
The LIGO-Virgo-KAGRA (LVK) collaboration has detected hundreds of black hole and neutron star collisions. Theorists have proposed a zoo of explanations, each requiring a fresh, expensive run through the statistical machinery. Exploring this space of theories is like taste-testing every dish at a buffet when each plate takes an hour to prepare.
Researchers at MIT’s LIGO Laboratory have now built a neural network shortcut that cuts this analysis time from hours to seconds without sacrificing accuracy.
Key Insight: By replacing traditional trial-and-error computer algorithms with a neural network trained to directly learn the answer, the team achieves complete probability maps in seconds on a GPU, requiring up to 1,000 times fewer model calculations than existing methods.
How It Works
The problem is Bayesian population inference: computing a posterior distribution, a map showing which combinations of model parameters are most probable given all observed data. Astronomers normally use algorithms like nested sampling or Markov chain Monte Carlo (MCMC), which explore parameter space by taking millions of random steps, evaluating how well the model fits the data at each one. Reliable, but slow. These methods need hundreds of thousands to millions of likelihood evaluations, each one asking "how probable are these data if the parameters are λ?"
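To see why stochastic sampling is expensive, here is a minimal Metropolis-Hastings loop (a simple MCMC variant) in plain NumPy, run against a made-up one-parameter "posterior". The toy target and all numbers are illustrative, not from the paper; the point is only that every step of the chain costs one likelihood call.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_post(lam):
    # Toy log-posterior: each call stands in for one expensive likelihood evaluation
    return -0.5 * ((lam - 2.0) / 0.5) ** 2

n_steps = 100_000                # a modest chain length by MCMC standards...
lam, lp = 0.0, log_post(0.0)
chain, n_evals = [], 1
for _ in range(n_steps):
    prop = lam + 0.5 * rng.standard_normal()   # random-walk proposal
    lp_prop = log_post(prop)
    n_evals += 1                 # ...and one likelihood evaluation per step
    if np.log(rng.uniform()) < lp_prop - lp:   # Metropolis accept/reject
        lam, lp = prop, lp_prop
    chain.append(lam)

print(n_evals)                   # prints 100001: one call per step, plus the start
```

The resulting chain is correlated, so the effective number of independent samples is far smaller than the number of likelihood calls, which is exactly the cost the variational approach avoids.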
The new approach, variational inference, reframes the problem. Instead of sampling the posterior, you learn it. A normalizing flow (a neural network that can represent virtually any shape of probability distribution) learns to reshape a simple bell curve through a sequence of mathematical transformations. Training minimizes the Kullback-Leibler (KL) divergence, a standard measure of how different two probability distributions are.
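In symbols (using standard notation rather than anything specific to the paper), the objective is the "reverse" KL divergence from the flow $q_\phi$ to the posterior, the form that requires only samples from the flow itself:

$$
\mathcal{L}(\phi) \;=\; \mathrm{KL}\!\left[\,q_\phi(\lambda)\,\big\|\,p(\lambda\,|\,d)\,\right]
\;=\; \mathbb{E}_{\lambda \sim q_\phi}\!\left[\log q_\phi(\lambda) - \log p(d\,|\,\lambda) - \log \pi(\lambda)\right] + \log \mathcal{Z},
$$

where $p(d\,|\,\lambda)$ is the population likelihood, $\pi(\lambda)$ the prior on the population parameters, and the log-evidence $\log \mathcal{Z}$ is a constant in $\phi$, so it can be dropped during minimization.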
The training loop:
- Draw samples from the current flow approximation
- Evaluate the likelihood and prior for those samples
- Compute the KL divergence loss
- Update the flow parameters to reduce the loss
- Repeat, for only ~10³ to 10⁴ total likelihood evaluations (orders of magnitude fewer than nested sampling)
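The loop above can be sketched in plain NumPy. As a deliberately simplified stand-in, the flexible normalizing flow is replaced here by the simplest possible "flow", an affine map of a standard normal (i.e. a Gaussian variational family), with the reverse-KL gradients derived by hand for that family via the reparameterization trick; the target is a made-up Gaussian posterior, not a real population likelihood.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_target(lam):
    # Stand-in for log-likelihood + log-prior: a Gaussian with mean 2, sd 0.5
    return -0.5 * ((lam - 2.0) / 0.5) ** 2 - np.log(0.5 * np.sqrt(2.0 * np.pi))

# Variational family (toy "flow"): lam = mu + exp(log_sigma) * eps, eps ~ N(0, 1)
mu, log_sigma = 0.0, 0.0
lr = 0.05
for step in range(2000):
    eps = rng.standard_normal(256)             # draw base samples
    sigma = np.exp(log_sigma)
    lam = mu + sigma * eps                     # reparameterized samples from q
    # Reverse-KL loss: E_q[log q(lam) - log_target(lam)]; pathwise gradients
    # for this Gaussian family (the -1 is d/dlog_sigma of the negative entropy):
    dlogp = -(lam - 2.0) / 0.25                # d/dlam of log_target
    grad_mu = np.mean(-dlogp)
    grad_ls = np.mean(-dlogp * sigma * eps) - 1.0
    mu -= lr * grad_mu                         # gradient step on the flow parameters
    log_sigma -= lr * grad_ls

print(mu, np.exp(log_sigma))                   # should approach the target's 2.0 and 0.5
```

In the real method the Gaussian family is replaced by a normalizing flow whose gradients come from automatic differentiation, but the structure of the loop (sample, evaluate, compute loss, update) is the same.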
Once trained, the flow gives you an exact probability density. You can draw as many independent samples as you want, instantly. Traditional samplers produce correlated chains; this produces truly independent draws.
What if the flow misses part of the posterior? The team backs it up with Pareto smoothed importance sampling (PSIS). After training, samples from the flow are reweighted by the ratio of the true posterior to the flow’s approximation, correcting for any gaps in coverage.
A fitted Pareto shape parameter k̂ works as a built-in diagnostic: k̂ < 0.7 means the approximation is reliable, while higher values signal that the flow needs more training or flexibility. No need to run an independent sampler for comparison.
PSIS also yields the Bayesian evidence, the integral quantifying how well a model fits the data overall. With evidence values in hand, researchers can compute Bayes factors to compare competing population models head-to-head. Traditional variational methods struggle with evidence estimation, but the PSIS approach provides unbiased estimates when the flow covers the posterior support.
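The reweighting and evidence estimate can be sketched as follows. This is a hedged toy: the "trained flow" is replaced by a slightly mismatched Gaussian, the unnormalized posterior is made up so the true evidence is known, and the Pareto-shape fit uses SciPy's maximum-likelihood `genpareto.fit` on the largest weights rather than the Zhang-Stephens fit of real PSIS (available via, e.g., ArviZ's `psislw`).

```python
import numpy as np
from scipy.special import logsumexp
from scipy.stats import genpareto

rng = np.random.default_rng(1)

# Toy unnormalized posterior: 0.5 * N(lam; 2, 0.5), so the true evidence is Z = 0.5
def log_target(lam):
    return np.log(0.5) - 0.5 * ((lam - 2.0) / 0.5) ** 2 - np.log(0.5 * np.sqrt(2.0 * np.pi))

# A slightly mismatched "trained flow": q = N(1.9, 0.6), a bit wider than the posterior
mu, sigma = 1.9, 0.6
lam = rng.normal(mu, sigma, 4000)
log_q = -0.5 * ((lam - mu) / sigma) ** 2 - np.log(sigma * np.sqrt(2.0 * np.pi))
log_w = log_target(lam) - log_q            # importance log-weights: true / approximation

# Evidence estimate: Z is the average importance weight
log_Z = logsumexp(log_w) - np.log(len(log_w))

# Crude Pareto-k diagnostic: fit a generalized Pareto to the largest-weight tail
w = np.sort(np.exp(log_w - log_w.max()))
m = int(0.2 * len(w))                      # top 20% of weights form the tail
excess = w[-m:] - w[-m - 1]                # excesses over the tail threshold
k_hat, _, _ = genpareto.fit(excess, floc=0.0)

print(log_Z, k_hat)                        # log_Z near log(0.5); k_hat < 0.7 signals a good fit
```

Because the toy proposal is wider than the target, the weights are bounded and the fitted shape comes out small, which is the regime where the smoothed estimates are trustworthy.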
The team validated their method against established nested sampling on real LVK black hole merger data. For the current catalog, the GPU-accelerated flow trains in seconds and recovers posteriors that match nested sampling closely. They also stress-tested on a mock catalog of 1,599 events, comparable to what future LVK observing runs will produce. The neural variational approach handled it in minutes. Nested sampling would have taken hours.
Why It Matters
This speed gain opens up new science. Population inference for gravitational waves drives some of the field's biggest questions. Do merging black holes form predominantly from isolated binary stars or in dense stellar clusters? What is the maximum black hole mass from stellar collapse? How do spin orientations reveal the dynamical history of binary systems?
Each question spawns dozens of competing models. Answering them properly means running every model on the same data and comparing results. When each fit takes hours, thorough exploration is impractical. When each fit takes seconds, it becomes interactive. Researchers can iterate on model assumptions in real time, probe edge cases, and test physical hypotheses the way a programmer iterates on code: test, tweak, test again.
As LVK accumulates detections through future observing runs, and as next-generation detectors like Einstein Telescope and Cosmic Explorer promise catalogs of tens of thousands of events, this kind of scalability matters more and more. The method’s computation time scales with the cost of a single likelihood evaluation, not with catalog size, so it won’t bottleneck on larger datasets.
Bottom Line: Neural variational posteriors with normalizing flows cut gravitational-wave population analysis from hours to seconds, unlock fast Bayesian model comparison, and scale naturally to the massive catalogs that next-generation detectors will deliver.
IAIFI Research Highlights
This work connects modern deep generative modeling (normalizing flows from machine learning) with Bayesian gravitational-wave astrophysics, showing that neural methods can work as drop-in replacements for established statistical samplers without loss of rigor.
Variational inference with normalizing flows, combined with Pareto smoothed importance sampling as a principled diagnostic, matches gold-standard stochastic samplers while requiring up to three orders of magnitude fewer model evaluations.
Faster population inference directly speeds up the astrophysical interpretation of LVK data, allowing more thorough exploration of black hole formation models and more careful Bayesian model comparison across the growing catalog of gravitational-wave events.
As gravitational-wave catalogs grow toward thousands of events with future observing runs and next-generation detectors, this approach will become increasingly valuable; code is publicly available and the work appears as [arXiv:2504.07197](https://arxiv.org/abs/2504.07197).