Summed-Squared-Error (SSE): Two New YouTubes
When students run neural network code, one of the first things that gets their attention is the summed-squared-error (SSE) – at the beginning of the run (before training), and then again after training has run for several thousand iterations.
There are usually three basic questions that people ask about SSE for a particular neural network (a quick back-of-the-envelope example follows the list):
- I’ve got an SSE value taken before my neural network starts training. Does this value make sense?
- My neural network has been training for a while now, and I have a new SSE. Is this SSE a good indicator of a trained neural net? And finally,
- What sort of SSE value should I use as a criterion to stop training?
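As a rough back-of-the-envelope check (my numbers here, not anything taken from the vids): the SSE is just the sum, over all training patterns and output nodes, of the squared difference between the target and the network's actual output. For the X-OR problem, there are four training patterns and one output node. An untrained network with small random weights tends to put out values near 0.5 for every pattern, so the starting SSE is roughly 4 × (0.5)² = 1.0. After training, if every output lands within about 0.05 of its target, the SSE drops to roughly 4 × (0.05)² = 0.01. That's the kind of intuition those three questions are really asking for.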
This quarter, it just sort of jumped out at me. Students in my deep learning course (MSDS 458: Artificial Intelligence and Deep Learning, in Northwestern University’s Master of Science in Data Science Program) were reporting their SSEs both before and after training their neural networks.
But I just had the feeling that they weren’t quite getting it. It seemed that they didn’t yet have an intuition about their SSEs – were the SSE values good? Bad? Indifferent? Maybe?
So this week, I cut these two vids.
The first is a general overview and a walk-through of a very simple example. The second applies the SSE to another (still simple) example, the X-OR neural network, in a bit more detail than the first.
If you’re new to neural networks, and trying to get a solid handle on the basics, check them out and see if they work for you!
YouTube Vid 1
Summed-Squared-Error (SSE): Assessing SSE for Neural Networks Training (Part 1)
First in a two-part vid series, showing how the summed-squared-error (SSE) is computed for a simple neural network (a Multilayer Perceptron).
YouTube Vid 2
Summed-Squared-Error (SSE): Neural Networks Back-Propagation X-OR Problem
Second in a two-part vid series, showing the summed-squared-error (SSE) computed for the X-OR neural network, both before and after training with a stochastic gradient descent algorithm.
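If you want to poke at these numbers yourself before (or after) watching the vids, here’s a minimal sketch along the same lines – not the exact network from the vids: a small 2-3-1 sigmoid multilayer perceptron, trained on X-OR by per-pattern (stochastic) gradient descent, printing the SSE before and after training. The layer sizes, learning rate, and iteration count are my own illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# The four X-OR training patterns and their targets.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
T = np.array([[0.], [1.], [1.], [0.]])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Small random starting weights: 2 inputs -> 3 hidden -> 1 output, plus biases.
W1 = rng.uniform(-0.5, 0.5, (2, 3)); b1 = np.zeros(3)
W2 = rng.uniform(-0.5, 0.5, (3, 1)); b2 = np.zeros(1)

def forward(x):
    h = sigmoid(x @ W1 + b1)          # hidden-layer activations
    y = sigmoid(h @ W2 + b2)          # output-layer activation
    return h, y

def sse():
    # Summed-squared-error: sum of (target - output)^2 over all patterns and outputs.
    return float(sum(np.sum((t - forward(x)[1]) ** 2) for x, t in zip(X, T)))

print(f"SSE before training: {sse():.4f}")    # typically close to 1.0

eta = 0.5                                      # learning rate (illustrative)
for epoch in range(20000):
    for i in rng.permutation(len(X)):          # "stochastic": one pattern at a time
        x, t = X[i], T[i]
        h, y = forward(x)
        # Gradient of the squared error, back-propagated through the sigmoids.
        delta_out = (y - t) * y * (1.0 - y)
        delta_hid = (delta_out @ W2.T) * h * (1.0 - h)
        W2 -= eta * np.outer(h, delta_out); b2 -= eta * delta_out
        W1 -= eta * np.outer(x, delta_hid); b1 -= eta * delta_hid

print(f"SSE after training:  {sse():.4f}")     # typically a small fraction of 1.0
```

Depending on the random starting weights, a tiny network like this can occasionally get stuck in a poor local minimum, which is part of why question three above (when to stop training) is trickier than it looks.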
Major Goofus: Can You Find It?
So, just before pushing the upload button on the second vid, I realized that I had a major goofus.
I was tired. Grumpy. And I just wanted to get that thing done. (You know the feeling, right?)
So I pushed the upload button – because I knew that with over twenty “deep learning” students reviewing these vids over the next few days, they would certainly spot the goofus that I knew about, and probably several others as well. (Students are really good at goofus-spotting.)
So here’s a hint: it’s notation.
I swear, my next vid is going to be a rant – on how delicate, tricky, and just mind-numbing it can be to deal with notation. One of the greatest “gotchas” in the field.
And if you don’t do it right, it bites you. Right in the soft and tender spots.
So … this weekend … I’m going to have to edit and re-upload those two vids – and probably at least one of the two vids that I cut before these – and maybe one or more chapters in the book.
But – can you spot the goofus before I announce it and make the fix? (Or fixes, plural, as the case is likely to be.)
If you’re not currently enrolled in one of my classes, I can’t give you bonus points. (Generous bonus points are my current offer to students for goofus-spotting.) BUT – I can certainly respond to you in the Comments below, and you can get fame and glory amongst your peers by being first-to-announce the great, obvious error.
Have fun with it!
Leave comments.
And thank you!
Live free or die, my friend –
AJ Maren
Live free or die: Death is not the worst of evils.
Attr. to Gen. John Stark, American Revolutionary War