A “First Principles” Approach to Artificial General Intelligence
What We Need to Take the Next Tiny, Incremental Step:
The “next big thing” is likely to be the next small thing – a tiny step, an incremental shift in perspective. However, a perspective shift is all that we need in order to make some real advances towards artificial general intelligence (AGI).
In the second chapter of the ongoing book, I share the following figure (and sorry, the chapter itself is not yet released):
Now, we’ve actually been doing a pretty good job of grabbing hold of these major strategies and using them as guiding principles in neural network development:
- Deep learning is the notion of multiple representation levels piled higher and deeper,
- Convolutional neural networks combine the benefits of two strategies: multiple representation levels and 2-D topographic maps, and
- Energy-based learning methods combine multiple representation levels with free energy minimization in order to learn the desired connection weight strengths (a minimal sketch of the canonical example appears just below this list).
This isn’t bad.
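To make that third item a bit more concrete: the restricted Boltzmann machine is the textbook energy-based learner, and its training amounts to nudging the weights so that the free energy of the data vectors goes down. Here’s a minimal numpy sketch; the sizes and random weights are made up purely for illustration, not taken from any particular model.

```python
import numpy as np

def rbm_free_energy(v, W, b, c):
    """Free energy F(v) of a visible vector under a binary RBM.

    With energy E(v, h) = -b.v - c.h - v^T W h, marginalizing over the
    hidden units h gives F(v) = -b.v - sum_j log(1 + exp(c_j + (v W)_j)).
    Training nudges W, b, c so that F(v) drops on the data vectors.
    """
    hidden_term = np.logaddexp(0.0, c + v @ W).sum()  # numerically stable softplus
    return -(v @ b) - hidden_term

# Tiny illustration with made-up sizes and random weights
rng = np.random.default_rng(0)
n_visible, n_hidden = 6, 4
W = rng.normal(scale=0.1, size=(n_visible, n_hidden))
b = np.zeros(n_visible)
c = np.zeros(n_hidden)
v = rng.integers(0, 2, size=n_visible).astype(float)
print(rbm_free_energy(v, W, b, c))
```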
What’s missing, though, is a broader use of free energy minimization, as is shown in the following figure.
Now, I wouldn’t be making such a big deal out of this if it were not likely that the brain does extensive free energy minimization. It seems to be doing this a lot, and in a way that suggests that this is more than just a learning process – it’s a regular, ongoing operation. (For those of you who want to dig a bit deeper: I wrote a paper in late 2016 that gives a pretty decent summary of what we know, or believe that we know, about free energy minimization in the brain; here’s a link to The Cluster Variation Method: A Primer for Neuroscientists.)
The logical question, then, is: if free energy minimization is such a pervasive process, why haven’t we grabbed onto it sooner? Why have we been using it in such a limited fashion, if the brain is using it all the time?
The answer might be in what we use for a free energy function. That is, if we’re going to do free energy minimization, then we need a function that gets minimized.
So far, the free energy function that most of us know and love is just a wee bit limited.
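For reference, the free energy that most of us know and love is the plain Helmholtz form: an energy term minus a temperature-weighted entropy term, which is equivalently the negative log of the partition function:

```latex
% The garden-variety (Helmholtz) free energy: energy minus
% temperature-weighted entropy, or (equivalently) the negative
% log of the partition function Z.
\begin{align}
  F &= U - TS, \\
  F &= -k_{B} T \ln Z, \qquad Z = \sum_{i} e^{-E_{i}/(k_{B} T)} .
\end{align}
```

It’s that entropy term, S, that will carry most of the interesting structure once we unpack it.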
Now, life circumstances constrain me to make today’s post short.
Next week, we’ll pick up with this theme.
We’ll start with the classic (Ising) free energy expression, take it apart, and look at the various components. In particular, we’ll pay attention to the entropy, since that’s the most complex element of the free energy. (It’s not that horridly complex, once we start graphing it out.)
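As a small preview (a sketch, not the full treatment, and assuming the simple bistate / mean-field form of the entropy that next week’s post will unpack), “graphing it out” takes only a few lines of Python:

```python
import numpy as np
import matplotlib.pyplot as plt

# Mixing entropy of a simple bistate (on/off) system: the piece of the
# Ising / Bragg-Williams free energy that carries most of the structure.
x = np.linspace(0.001, 0.999, 500)            # fraction of units in the "on" state
entropy = -(x * np.log(x) + (1 - x) * np.log(1 - x))

plt.plot(x, entropy)
plt.xlabel("x (fraction of units in the active state)")
plt.ylabel("S(x)")
plt.title("Entropy term: maximal at x = 0.5")
plt.show()
```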
We’ll use this widely known free energy function to run another Gedankenexperiment (German for “thought experiment”), and we’ll ask ourselves: what would happen if we used this particular free energy expression as the thing that we were minimizing?
The answer will be (spoiler alert!): not nearly as much as we’d like.
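To make the spoiler concrete (a minimal sketch, assuming the simple bistate form of the free energy with the interaction enthalpy left out and kT folded into the units): minimizing this expression just drops the system onto one fixed equilibrium point, determined entirely by the activation energy. It settles there, and then has nothing further to do.

```python
import numpy as np

# Thought-experiment sketch, assuming the simple bistate free energy
# in reduced units (kT = 1): f(x) = eps*x + [x ln x + (1-x) ln(1-x)].
# Setting df/dx = 0 gives a single equilibrium at x = 1 / (1 + e^eps).
def free_energy(x, eps):
    return eps * x + x * np.log(x) + (1 - x) * np.log(1 - x)

eps = 1.0
x = np.linspace(0.001, 0.999, 100_000)
x_min = x[np.argmin(free_energy(x, eps))]
print(f"numeric minimum:  x = {x_min:.4f}")
print(f"analytic minimum: x = {1.0 / (1.0 + np.exp(eps)):.4f}")
```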
This is why people have largely not been pursuing free energy minimization in a way that would make sense, given how important it is as a natural process – and how important it likely is in the brain.
There’s something that we can do, though. We can look at the free energy expression that we’ve got and say: this doesn’t really tell the full story. We can use a similar (albeit more complex) function, and maybe get something a bit more interesting to happen.
That, my friend, is an overview of the path that we’ll be taking through the woods.
At the end – and it may be very premature to suggest this – we would like to have something that:
- Makes sense – in terms of describing (or modeling) a system that we think is worth modeling,
- Behaves well in public – mathematically, whatever we come up with needs to play nicely, and
- Offers hope – in the sense that if we stick this free energy equation into a larger system, and, as part of its job, it gets minimized on a frequent and regular basis (sounds a little BDSM-ish, doesn’t it?), then it makes the whole system act in a useful way – we get a greater range of behaviors that are still well-behaved. Maybe something that we can even understand.
Until next week –
Live free or die, my friend –
AJ Maren
Live free or die: Death is not the worst of evils.
Attr. to Gen. John Stark, American Revolutionary War
Some Useful Background Reading on Statistical Mechanics
- Hermann, C. Statistical Physics – Including Applications to Condensed Matter (New York: Springer Science+Business Media), 2005. pdf. Very well written; however, for someone who is NOT a physicist or physical chemist, the approach may be too obscure.
- Maren, A.J. Statistical Thermodynamics: Basic Theory and Equations, THM TR2013-001(ajm) (Dec., 2013). Statistical Thermodynamics: Basic Theory and Equations.
- Salzman, R. Notes on Statistical Thermodynamics – Partition Functions, in Course Materials for Chemistry 480B – Physical Chemistry, 2004. Statistical Mechanics (chapter). Online book chapter. This is one of the best online resources for statistical mechanics; I’ve found it to be very useful and lucid.
- Tong, D. Chapter 1: Fundamentals of Statistical Mechanics, in Lectures on Statistical Physics (University of Cambridge Part II Mathematical Tripos), Preprint (2011). pdf.
Previous Related Posts
- A Hidden Layer Guiding Principle: What We Minimally Need
- How Getting to a Free Energy Bottom Helps Us Get to the Top
- What’s Next for AI: Beyond Deep Learning
- Third Stage Boost – Part 2: Implications of Neuromorphic Computing
- Third Stage Boost: Statistical Mechanics and Neuromorphic Computing
- Neg-Log-Sum-Exponent-Neg-Energy – That’s the Easy Part!
- 2025 and Beyond
- Deep Learning: The Fast Evolution of Artificial Intelligence
- Statistical Mechanics of Machine Learning blogpost – the great “St. Crispin’s Day” introduction of statistical mechanics and machine learning.
- Brain-based Computing: Foundation for Deep Learning.