Bayesian Inference

Remember that using Bayes' Theorem doesn't make you a Bayesian. Quantifying uncertainty with probability makes you a Bayesian. (Michael Betancourt)

Overview

  • Books
    • Bayesian Methods for Hackers Cam Davidson-Pilon (lead author) — Free book on Bayesian inference written entirely as Jupyter notebooks
    • Bayesian Data Analysis (2020) Andrew Gelman, John B. Carlin, Hal S. Stern, David B. Dunson, Aki Vehtari, Donald B. Rubin
  • Bayesian Workflow (2020) Andrew Gelman, Aki Vehtari, Daniel Simpson, Charles C. Margossian, Bob Carpenter, Yuling Yao, Lauren Kennedy, Jonah Gabry, Paul-Christian Bürkner, Martin Modrák

Markov chain Monte Carlo (MCMC)

Stochastic Gradient MCMC (SG-MCMC)

  • SGLD Stochastic Gradient Langevin Dynamics
    • Bayesian Learning via Stochastic Gradient Langevin Dynamics (2011) Max Welling, Yee Whye Teh — Shows that adding calibrated noise to stochastic gradient descent produces asymptotically exact posterior samples, enabling Bayesian inference to scale to large datasets for the first time without full-batch MCMC.
  • SGHMC Stochastic Gradient Hamiltonian Monte Carlo
    • Stochastic Gradient Hamiltonian Monte Carlo (2014) Tianqi Chen, Emily B. Fox, Carlos Guestrin — Extends SGLD by incorporating momentum (as in HMC above), adding a friction term to correct for gradient noise and improving mixing over the random-walk behavior of SGLD.
  • A Complete Recipe for Stochastic Gradient MCMC (2015) Yi-An Ma, Tianqi Chen, Emily B. Fox — Provides a unifying framework showing that SGLD, SGHMC, and other SG-MCMC variants are all special cases of continuous Markov processes parameterized by two matrices, and introduces new samplers like SGRHMC within this framework.
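The SGLD update rule from Welling & Teh is compact enough to sketch in full. A minimal toy run on a Gaussian mean-estimation problem; the model, step size, and batch size below are illustrative choices, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model: x_i ~ N(theta, 1) with prior theta ~ N(0, 10)
theta_true, N = 2.0, 1000
x = rng.normal(theta_true, 1.0, size=N)

def grad_log_prior(theta):
    return -theta / 10.0                  # d/dtheta of log N(theta; 0, 10)

def grad_log_lik(theta, batch):
    return np.sum(batch - theta)          # d/dtheta of sum_i log N(x_i; theta, 1)

eps, n = 1e-4, 50                         # step size (fixed here; the paper anneals it), batch size
theta, samples = 0.0, []
for t in range(5000):
    batch = rng.choice(x, size=n, replace=False)
    # SGLD: half-step along the minibatch-rescaled stochastic gradient, plus N(0, eps) noise
    grad = grad_log_prior(theta) + (N / n) * grad_log_lik(theta, batch)
    theta += 0.5 * eps * grad + rng.normal(0.0, np.sqrt(eps))
    if t >= 1000:                         # discard burn-in
        samples.append(theta)

samples = np.asarray(samples)
print(samples.mean(), samples.std())      # approximates the posterior over theta
```

With a fixed step size the injected noise competes with minibatch gradient noise; that trade-off is exactly what the paper's decreasing step-size schedule addresses.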

Sequential Monte Carlo (SMC)

  • Sequential Monte Carlo Methods in Practice (2001) Arnaud Doucet, Nando de Freitas, Neil Gordon (Editors) — The foundational reference on particle filters: propagates weighted samples through a sequence of distributions, enabling online inference in state-space models where MCMC would require re-running from scratch.
  • Sequential Monte Carlo Samplers (2006) Pierre Del Moral, Arnaud Doucet, Ajay Jasra — Generalizes SMC beyond filtering to sample from arbitrary sequences of static distributions, making it applicable to Bayesian model comparison and tempered posteriors — the key theoretical bridge between particle filters and general Bayesian computation.
  • An Introduction to Sequential Monte Carlo (2020) Nicolas Chopin, Omiros Papaspiliopoulos — Modern textbook treatment covering both the theory (Feynman-Kac formalism) and practice of SMC, including waste-free SMC and connections to tempering strategies used in modern samplers.
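The bootstrap particle filter these references build on fits in a few lines: propagate particles through the transition, reweight by the observation likelihood, resample. A sketch on a toy linear-Gaussian state-space model (parameters and particle count are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy state-space model: x_t = 0.9 x_{t-1} + N(0, 1), y_t = x_t + N(0, 0.5^2)
T, P = 100, 500
xs, ys = np.zeros(T), np.zeros(T)
x = 0.0
for t in range(T):
    x = 0.9 * x + rng.normal()
    xs[t] = x
    ys[t] = x + rng.normal(0.0, 0.5)

# Bootstrap particle filter
particles = rng.normal(0.0, 1.0, P)
means = np.zeros(T)
for t in range(T):
    particles = 0.9 * particles + rng.normal(size=P)   # sample the transition
    logw = -0.5 * ((ys[t] - particles) / 0.5) ** 2     # log observation likelihood
    w = np.exp(logw - logw.max())
    w /= w.sum()
    means[t] = np.sum(w * particles)                   # filtering mean E[x_t | y_1:t]
    particles = particles[rng.choice(P, size=P, p=w)]  # multinomial resampling

rmse = np.sqrt(np.mean((means - xs) ** 2))
print(rmse)   # tracks the latent states to roughly the observation-noise level
```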

Approximate Bayesian Computation (ABC)

Variational Inference (VI)

Normalizing Flows for Inference

  • Variational Inference with Normalizing Flows (2015) Danilo Rezende, Shakir Mohamed — Introduces the idea of transforming a simple variational posterior through a chain of invertible mappings, breaking free of the mean-field assumption that limits standard VI and enabling arbitrarily complex approximate posteriors.
  • Normalizing Flows for Probabilistic Modeling and Inference (2021) George Papamakarios, Eric Nalisnick, Danilo Jimenez Rezende, Shakir Mohamed, Balaji Lakshminarayanan — Definitive review of flow architectures (coupling, autoregressive, residual), their expressive power, and applications spanning density estimation, variational inference, and simulation-based inference.
  • Model-Informed Flows for Bayesian Inference (2025) Joohwan Ko, Justin Domke — Proves that Variationally Inferred Parameters (VIP) can be represented exactly as autoregressive flows augmented with the model's prior, then exploits this connection to design Model-Informed Flows that deliver tighter posteriors for hierarchical Bayesian models.
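The machinery shared by all of these papers is the change-of-variables formula: transform samples through an invertible map and correct the density by the log-determinant of the Jacobian. A one-dimensional affine sketch (toy numbers) makes the bookkeeping explicit:

```python
import numpy as np

rng = np.random.default_rng(2)

# Push a standard normal base density through the invertible map z = a*z0 + b
a, b = 2.0, 1.0
z0 = rng.normal(size=10000)
z = a * z0 + b

# Change of variables: log q(z) = log q0(z0) - log |dz/dz0|
log_q = -0.5 * z0**2 - 0.5 * np.log(2 * np.pi) - np.log(abs(a))

# The result is exactly the N(b, a^2) log-density
exact = -0.5 * ((z - b) / a) ** 2 - 0.5 * np.log(2 * np.pi) - np.log(abs(a))
err = np.max(np.abs(log_q - exact))
print(err)
```

Deep flows compose many such maps with learned, nonlinear Jacobians; the log-det corrections simply add along the chain.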

Expectation Propagation (EP)

  • Expectation Propagation for Approximate Bayesian Inference (2001) Thomas P. Minka — Proposes a deterministic alternative to MCMC and VI that iteratively refines local likelihood approximations by moment matching, unifying assumed-density filtering and loopy belief propagation; often more accurate than the Laplace approximation (below) and variational Bayes at comparable cost.
  • Expectation Propagation as a Way of Life (2020) Aki Vehtari, Andrew Gelman, Tuomas Sivula, Pasi Jylänki, Dustin Tran, Swupnil Sahai, Paul Blomstedt, John P. Cunningham, David Schiminovich, Christian Robert — Reframes EP as a framework for distributed Bayesian inference: data partitions communicate through iteratively refined approximate likelihoods, enabling parallelism while preserving information sharing — addressing scalability limits of both standard EP and MCMC.
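The moment-matching step at the heart of EP can be shown in one dimension: replace the non-Gaussian "tilted" distribution N(x; m, v)·Φ(x) by the Gaussian with the same first two moments. The closed forms below are the standard Gaussian-probit moment integrals from the Gaussian-process literature; the setup and quadrature check are illustrative:

```python
import math
import numpy as np

# Tilted distribution: N(x; m, v) * Phi(x), with Phi the standard normal CDF
m, v = 0.0, 1.0
z = m / math.sqrt(1.0 + v)
Phi_z = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
phi_z = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
ratio = phi_z / Phi_z

# Closed-form matched moments
new_m = m + v * ratio / math.sqrt(1.0 + v)
new_v = v - v**2 * ratio / (1.0 + v) * (z + ratio)

# Brute-force quadrature check of the same two moments
x = np.linspace(-10.0, 10.0, 200001)
dx = x[1] - x[0]
std_pdf = np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)
Phi_x = np.cumsum(std_pdf) * dx                 # numeric CDF; tail below -10 is negligible
p = np.exp(-0.5 * (x - m) ** 2 / v) * Phi_x
p /= p.sum() * dx
num_m = np.sum(x * p) * dx
num_v = np.sum((x - num_m) ** 2 * p) * dx
print(new_m, num_m, new_v, num_v)
```

Full EP iterates this operation over all sites, dividing out each local approximation before re-matching it against the corresponding exact factor.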

Laplace Approximation

  • A Practical Bayesian Framework for Backpropagation Networks (1992) David J.C. MacKay — Pioneering work applying a second-order Taylor expansion (Laplace approximation) around the MAP estimate to approximate the posterior over neural network weights, enabling model comparison via the Bayesian evidence — the simplest deterministic approach to Bayesian neural networks.
  • Laplace Redux — Effortless Bayesian Deep Learning (2021) Erik Daxberger, Agustinus Kristiadi, Alexander Immer, Runa Eschenhagen, Matthias Bauer, Philipp Hennig — Revives MacKay's Laplace approach for modern deep networks with scalable Kronecker-factored and last-layer approximations; shows it is competitive with MC Dropout and ensembles (see Bayesian Deep Learning below) at a fraction of the cost, and provides the laplace-torch library.
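The recipe in both papers reduces to: find the MAP, take the curvature of the log posterior there, report a Gaussian. A one-parameter logistic-regression sketch (data and prior scale are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(3)

# Model: y_i ~ Bernoulli(sigmoid(theta)), prior theta ~ N(0, 4)
y = rng.binomial(1, 0.7, size=200)

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def grad_hess(theta):
    p = sigmoid(theta)
    g = np.sum(y - p) - theta / 4.0          # gradient of the log posterior
    h = -len(y) * p * (1 - p) - 1.0 / 4.0    # second derivative (always negative)
    return g, h

theta = 0.0
for _ in range(50):                          # Newton's method for the MAP
    g, h = grad_hess(theta)
    theta -= g / h

_, h = grad_hess(theta)
map_est, lap_std = theta, np.sqrt(-1.0 / h)  # Laplace: posterior ~ N(map_est, lap_std^2)
print(map_est, lap_std)
```

For a network, h becomes the Hessian over all weights, which is where the Kronecker-factored and last-layer approximations of Laplace Redux come in.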

Simulation-Based Inference (SBI)

  • The Frontier of Simulation-Based Inference (2020) Kyle Cranmer, Johann Brehmer, Gilles Louppe — Landmark review of the shift from classical ABC methods (above) to neural network-based likelihood-free inference; surveys how neural density estimators, classifiers, and ratio estimators replace the rejection/tolerance mechanisms of ABC with learned surrogates.
  • NPE Neural Posterior Estimation
  • NLE Neural Likelihood Estimation
    • Sequential Neural Likelihood (2019) George Papamakarios, David Sterratt, Iain Murray — Instead of learning the posterior directly (as NPE does), learns a neural surrogate of the likelihood using autoregressive flows, then plugs it into standard MCMC — more robust to model misspecification and composable with different priors without retraining.
  • NRE Neural Ratio Estimation
    • Approximating Likelihood Ratios with Calibrated Discriminative Classifiers (2015) Kyle Cranmer, Juan Pavez, Gilles Louppe — Trains a classifier to distinguish parameter-data pairs, whose output directly estimates the likelihood ratio — avoids density estimation entirely, requiring only a binary classification objective, and is well-suited to hypothesis testing.
    • Benchmarking Simulation-Based Inference (2021) Jan-Matthis Lueckmann, Jan Boelts, David Greenberg, Pedro Goncalves, Jakob Macke — Systematic comparison of NPE, NLE, NRE and classical ABC on standardized tasks; finds that neural methods consistently outperform ABC but no single algorithm dominates, and that sequential variants improve sample efficiency.
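The ratio trick is easy to demonstrate with a linear-logistic stand-in for the neural classifier. A sketch assuming a toy simulator x ~ N(theta, 1) with theta ~ N(0, 1), chosen so the true ratio is known in closed form; the features and hyperparameters are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 20000

# Class 1: joint pairs (theta, x) from the simulator; class 0: shuffled pairs
theta = rng.normal(size=n)
x_joint = theta + rng.normal(size=n)       # x ~ p(x | theta) = N(theta, 1)
x_marg = rng.permutation(x_joint)          # breaks the pairing, so x ~ p(x)

def feats(t, x):                           # quadratic features suffice for this model
    return np.stack([np.ones_like(t), t * x, t**2, x**2], axis=1)

X = np.concatenate([feats(theta, x_joint), feats(theta, x_marg)])
y = np.concatenate([np.ones(n), np.zeros(n)])

w = np.zeros(4)
for _ in range(5000):                      # logistic regression by gradient ascent
    p = 1.0 / (1.0 + np.exp(-X @ w))
    w += 0.1 * X.T @ (y - p) / len(y)

# The classifier logit estimates log p(x|theta)/p(x); compare to the truth
# (here p(x) = N(0, 2)) along a slice with theta fixed at 0.5
t_grid = np.full(50, 0.5)
x_grid = np.linspace(-3.0, 3.0, 50)
est = feats(t_grid, x_grid) @ w
true = -0.5 * (x_grid - t_grid) ** 2 + 0.25 * x_grid**2 + 0.5 * np.log(2.0)
corr = np.corrcoef(est, true)[0, 1]
print(corr)
```

No density is ever estimated: a binary classification objective alone recovers the likelihood ratio, which is the point of the Cranmer et al. paper.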

Diffusion Models for Posterior Sampling

  • Score-Based Generative Modeling through Stochastic Differential Equations (2021) Yang Song, Jascha Sohl-Dickstein, Diederik P. Kingma, Abhishek Kumar, Stefano Ermon, Ben Poole — Unifies score matching and diffusion models as continuous-time SDEs that gradually corrupt data into noise and reverse the process via learned score functions; provides the theoretical foundation for using diffusion models as priors in Bayesian inverse problems.
  • Diffusion Posterior Sampling for General Noisy Inverse Problems (2023) Hyungjin Chung, Jeongsol Kim, Michael T. McCann, Marc L. Klasky, Jong Chul Ye — Combines a pretrained diffusion prior (from above) with a measurement likelihood to sample from the Bayesian posterior for inverse problems, using manifold-constrained gradients to handle both linear and nonlinear forward models with noise.
  • Score-based diffusion models for diffuse optical tomography with uncertainty quantification (2026) Fabian Schneider, Meghdoot Mozumder, Konstantin Tamarov, Leila Taghizadeh, Tanja Tarvainen, Tapio Helin, Duc-Lam Duong — Applies the diffusion posterior sampling framework to medical imaging, introducing a regularization strategy that blends learned and model-based scores to prevent overfitting; demonstrates calibrated uncertainty estimates with lower variance than classical Bayesian methods.
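With a one-dimensional Gaussian data distribution, the score of every perturbed marginal is available in closed form, so the reverse-time SDE from Song et al. can be integrated without any network. A sketch of a VE-SDE sampler (schedule, step count, and data distribution are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)

# Data distribution N(m, s^2); VE noise schedule sigma(t) = s_min (s_max/s_min)^t
m, s = 2.0, 0.5
sig_min, sig_max = 0.01, 10.0
K = 1000
dt = 1.0 / K

def sigma(t):
    return sig_min * (sig_max / sig_min) ** t

def score(x, t):
    # Exact score of the perturbed marginal N(m, s^2 + sigma(t)^2)
    return -(x - m) / (s**2 + sigma(t) ** 2)

x = rng.normal(0.0, sig_max, size=5000)                    # prior sample at t = 1
for t in np.linspace(1.0, dt, K):                          # Euler-Maruyama, t: 1 -> 0
    g2 = 2.0 * np.log(sig_max / sig_min) * sigma(t) ** 2   # g(t)^2 = d sigma(t)^2 / dt
    x += g2 * score(x, t) * dt + np.sqrt(g2 * dt) * rng.normal(size=5000)

print(x.mean(), x.std())   # should roughly recover N(2, 0.5^2)
```

Posterior-sampling methods such as DPS replace this analytic score with a learned one plus a likelihood-gradient term from the measurement model.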

Bayesian Deep Learning

  • Weight Uncertainty in Neural Networks (2015) Charles Blundell, Julien Cornebise, Koray Kavukcuoglu, Daan Wierstra — Introduces "Bayes by Backprop": maintains a variational distribution over each weight (rather than a point estimate), optimizing the variational free energy with reparameterized gradients — the first practical VI method (see VI above) for modern deep networks.
  • Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning (2016) Yarin Gal, Zoubin Ghahramani — Reinterprets standard dropout training as approximate inference in a deep Gaussian process, enabling uncertainty estimates from any existing dropout network at test time with zero additional cost — far cheaper than Bayes by Backprop but less flexible.
  • Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (2017) Balaji Lakshminarayanan, Alexander Pritzel, Charles Blundell — Proposes training multiple networks with random initialization as a non-Bayesian alternative for uncertainty; despite its simplicity, deep ensembles empirically match or outperform both MC Dropout and Bayes by Backprop on calibration and out-of-distribution detection.
  • Bayesian Deep Learning and a Probabilistic Perspective of Generalization (2020) Andrew Gordon Wilson, Pavel Izmailov — Argues that deep ensembles succeed precisely because they approximate Bayesian model averaging, proposes MultiSWAG for cheaper within-basin marginalization, and shows that Bayesian averaging resolves pathologies like double descent.
  • Bayesian Computation in Deep Learning (2025) Wenlong Chen, Bolian Li, Ruqi Zhang, Yingzhen Li — Recent review organizing the Bayesian deep learning toolbox around two computational pillars: SG-MCMC (see above) and VI, covering their challenges (multimodality, cold posteriors) and solutions specific to deep neural networks and deep generative models.
  • See also: Bayesian Neural Networks in Neural Networks
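Deep ensembles need nothing beyond training the same architecture several times from different initializations. A miniature NumPy version (one hidden layer, full-batch gradient descent; all sizes and hyperparameters are illustrative) shows member disagreement growing away from the data:

```python
import numpy as np

rng = np.random.default_rng(6)
x = rng.uniform(-2.0, 2.0, size=(128, 1))
y = np.sin(2.0 * x) + 0.1 * rng.normal(size=(128, 1))

def train_net(seed, steps=8000, lr=0.02, h=32):
    r = np.random.default_rng(seed)
    W1, b1 = r.normal(0.0, 1.0, (1, h)), np.zeros(h)
    W2, b2 = r.normal(0.0, 1.0 / np.sqrt(h), (h, 1)), np.zeros(1)
    for _ in range(steps):                          # full-batch GD on MSE
        a = np.tanh(x @ W1 + b1)
        d = 2.0 * (a @ W2 + b2 - y) / len(x)        # dLoss/dprediction
        dW2, db2 = a.T @ d, d.sum(0)
        da = d @ W2.T * (1.0 - a**2)                # backprop through tanh
        dW1, db1 = x.T @ da, da.sum(0)
        W1 -= lr * dW1; b1 -= lr * db1; W2 -= lr * dW2; b2 -= lr * db2
    return lambda t: np.tanh(t @ W1 + b1) @ W2 + b2

ensemble = [train_net(seed) for seed in range(10)]  # same data, different inits

def predict(t):
    preds = np.stack([f(t) for f in ensemble])
    return preds.mean(0), preds.std(0)              # mean prediction and disagreement

_, std_in = predict(np.array([[0.0]]))              # inside the training range
_, std_out = predict(np.array([[6.0]]))             # far outside it
print(std_in.item(), std_out.item())                # disagreement grows off-distribution
```

This disagreement-as-uncertainty behavior is what Wilson & Izmailov interpret as an approximation to Bayesian model averaging across loss basins.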

Gaussian processes

Uncertainty calibration

Software

Related Topics