Approximate Bayesian Inference
Variational Free Energy
I spent some time trying to figure out the derivation for the variational free energy, as expressed in some of Friston’s papers (see citations below).
While I had worked out an intuitive justification, I recently found a clean derivation in Kokkinos's slides (see the reference and link below).
Other discussions about variational free energy:
Whereas maximum a posteriori methods optimize a point estimate of the parameters, in ensemble learning an ensemble is optimized, so that it approximates the entire posterior probability distribution over the parameters. The objective function that is optimized is a variational free energy (Feynman 1972) which measures the relative entropy between the approximating ensemble and the true distribution.
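For concreteness, the free energy described in that passage can be written in a standard form (my notation, not MacKay's: $q(\theta)$ is the approximating ensemble, $P(D, \theta)$ the joint over data and parameters):

$$
F(q) \;=\; \int q(\theta)\,\ln\frac{q(\theta)}{P(D,\theta)}\,d\theta
\;=\; \mathrm{KL}\!\left(q(\theta)\,\middle\|\,P(\theta \mid D)\right) \;-\; \ln P(D)
$$

Since the KL term is non-negative, minimizing $F$ over $q$ simultaneously tightens a bound on the log evidence $\ln P(D)$ and pulls the ensemble toward the true posterior.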
References for Statistical Thermodynamics, Approximate Bayesian Inference, Variational Free Energy, and Kullback-Leibler (Methodologies for Deep Learning and Artificial Intelligence)
This is a collection of the resources that I’m using as part of an overall self-education in the above topics.
References for Variational Inference (Including Variational Free Energy)
- Jang, E. (2014). A Beginner’s Guide to Variational Methods: Mean-Field Approximation blogpost. NOTE: Added May 27, 2017, replacing the Kay Hendersen PPT deck, which is no longer available.
- Huang, G.T. Is this a unified theory of the brain? (2008). New Scientist (Print Edition) (28 May 2008). 4 pp. online access.
- Feynman, R.P. (1972). Statistical Mechanics. W.A. Benjamin, Inc.
- Friston, K.; Levin, M.; Sengupta, B.; Pezzulo, G. Knowing one’s place: a free-energy approach to pattern regulation. J. R. Soc. Interface 2015, 12, 20141383. doi:10.1098/rsif.2014.1383. pdf.
- Friston, K. Life as we know it. Journal of The Royal Society Interface 2013, 10. pdf.
- Kokkinos, I. Introduction to Deep Learning: Variational Inference, Mean Field Theory (a PPT presentation) PPT. AJM’s Comments: See pg. 22 for the variational free energy equation.
- MacKay, D.J.C. Ensemble learning for hidden Markov models. pdf
- Free energy: Criticisms and Conjectures. blogpost. AJM’s comments: This is an interesting blog discussion of Friston’s ideas on variational free energy. Worth a read and a re-read.
- Wo (2013). The lure of free energy, Wo’s Weblog. blogpost. AJM’s Comments: Another good discussion of Friston’s approach using variational free energy.
- A blogpost giving a perspective on how the KL divergence relates to gradient descent learning in NNs / Deep Learning. blogpost.
- Gal, Y. and Ghahramani, Z. On modern deep learning and variational inference (20xx). pdf. AJM’s comment: Looks useful. “… empirical developments in deep learning are often justified by metaphors, evading the unexplained principles at play. It is perhaps astonishing then that most modern deep learning models can be cast as performing approximate variational inference in a Bayesian setting. This mathematically grounded result, studied in Gal and Ghahramani [1] for deep neural networks (NNs), is extended here to arbitrary deep learning models. The implications of this statement are profound…”
- Fox, C. and Roberts, S. A tutorial on variational Bayesian inference. pdf. AJM’s comment: Looks like a useful tutorial. Revisit soon.
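The identity linking variational free energy, the KL divergence, and the log evidence is easy to check numerically. Below is a minimal sketch with a hypothetical three-state discrete model (the prior, likelihood, and q values are made up for illustration):

```python
import numpy as np

# Toy discrete model over three parameter values theta
prior = np.array([0.5, 0.3, 0.2])       # p(theta)  (hypothetical)
likelihood = np.array([0.1, 0.7, 0.4])  # p(D | theta) for a fixed dataset D

joint = prior * likelihood              # p(D, theta)
evidence = joint.sum()                  # p(D), the model evidence
posterior = joint / evidence            # p(theta | D), exact by Bayes' rule

# An (imperfect) approximating distribution q(theta)
q = np.array([0.2, 0.5, 0.3])

# Variational free energy: F = E_q[ ln q(theta) - ln p(D, theta) ]
F = np.sum(q * (np.log(q) - np.log(joint)))

# KL divergence from q to the true posterior
kl = np.sum(q * (np.log(q) - np.log(posterior)))

# Check: F = KL(q || posterior) - ln p(D), so F upper-bounds -ln p(D)
print(np.isclose(F, kl - np.log(evidence)))  # True
print(F >= -np.log(evidence))                # True, since KL >= 0
```

Minimizing F over q therefore drives q toward the exact posterior, at which point F equals the negative log evidence.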
References for Statistical Thermodynamics
- M. Scott Shell, Thermodynamics and Statistical Mechanics: An Integrated Approach. Cambridge University Press, 2015, Chapter 16: The Canonical Partition Function Book Chapter, AJM’s Comments: This is a very good, readable, and intuitive introduction to statistical thermodynamics.
- M. Scott Shell, The relative entropy is fundamental to multiscale and inverse thermodynamic problems, The Journal of Chemical Physics 129, 144108 (2008). pdf. AJM’s Comments: This looks like a good paper; I stuck the reference here as part of a “to read later” stash.