Neural Networks and Python Code: Be Careful with the Array Indices!
Our Special Topics class on Deep Learning (Northwestern University, Master of Science in Predictive Analytics program, Winter, 2017) starts off with very basic neural networks: the backpropagation learning method applied to the classic X-OR problem. I’m writing Python code to go with this class, and the result by the end of the quarter should be five or six solid pieces of code, involving either the backpropagation or the Boltzmann machine learning algorithm, with various network configurations.
The following figure shows the dependence of the connection weights on the Summed Squared Error, used in backpropagating weight changes. The illustration is for the specific input case of (0,1), so only the inputs from the second input node (Input1) count towards activating either of the two hidden nodes. (The input from the first input node is 0 in this case, so it doesn’t communicate any activation signal.)
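To make the (0,1) case concrete, here is a minimal sketch of a forward pass written with explicit loops. Everything in it is an assumption for illustration, not the class code: the 2-2-1 layout with a single output node, the logistic transfer function, the absence of bias terms, the weight values, and the function names. It shows that when the first input is 0, the weights leaving that node have no effect on the output or on the Summed Squared Error.

```python
import math

def sigmoid(x):
    """Logistic transfer function (an assumption; the post does not name one)."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, w_hidden, w_output):
    """Forward pass for a small network using plain loops, no matrix operations.

    w_hidden[h][i] is the weight from input node i into hidden node h.
    w_output[h] is the weight from hidden node h into the single output node.
    Bias terms are left out to keep the sketch short.
    """
    hidden = []
    for h in range(len(w_hidden)):
        total = 0.0
        for i in range(len(inputs)):
            total += w_hidden[h][i] * inputs[i]
        hidden.append(sigmoid(total))
    total = 0.0
    for h in range(len(hidden)):
        total += w_output[h] * hidden[h]
    return sigmoid(total)

def summed_squared_error(targets, outputs):
    """Sum of squared differences between target and actual outputs."""
    return sum((t - o) ** 2 for t, o in zip(targets, outputs))

# Two hypothetical weight sets that differ ONLY in the weights leaving Input0.
w_hidden_a = [[0.5, -0.3], [0.8, 0.2]]
w_hidden_b = [[-9.0, -0.3], [4.0, 0.2]]
w_output = [0.7, -0.4]

inputs = [0.0, 1.0]   # the (0,1) case; the X-OR target is 1
out_a = forward(inputs, w_hidden_a, w_output)
out_b = forward(inputs, w_hidden_b, w_output)
sse_a = summed_squared_error([1.0], [out_a])

# Input0 is 0, so its weights contribute nothing: both outputs are identical.
print(out_a == out_b, sse_a)
```

Because the Input0 weights do not influence the error for this input pattern, a gradient-based update computed from this pattern alone would leave them unchanged.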
The goal for this week’s code was not just to create a neural network for the X-OR problem, but to use very basic, simple, readable Python code – especially readable by a Python novice, or someone who needs a serious refresher. Thus, even though I’m putting data into lists and arrays, and using those lists and arrays to move the data into and out of various functions and procedures, I’m avoiding matrix operations. Instead, I’m packing and unpacking the array each time I need it. Tedious and cumbersome, but much more traceable by someone who is new to (or very rusty with) Python.
Naturally, there was an interesting debug challenge. I’d misinterpreted how the neural network weight connections needed to be stored in a Python array. In essence, the indices are mirror-image reversed across the diagonal. For example, if we have a set of two input nodes and two hidden nodes, we want to compute the summed weighted input into each of the two hidden nodes, H0 and H1. There will be four input-to-hidden connection weights; two going into H0 and two going into H1.
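The kind of index mix-up described above can be sketched as follows. This is an illustration under stated assumptions, not the actual class code: the variable and function names are hypothetical, the weight values are made up, and the layout weights[hidden][input] is simply one convention picked for the example. The point is that reading the same array with the two indices swapped silently gives a different summed input.

```python
# Hypothetical weights; the values are illustrative only.
# Assumed convention: weights[h][i] is the weight from input node i INTO hidden node h.
w_input_to_hidden = [
    [0.1, 0.2],   # into H0: from Input0, from Input1
    [0.3, 0.4],   # into H1: from Input0, from Input1
]

def summed_input(weights, inputs, hidden_index):
    """Summed weighted input into one hidden node, reading weights[hidden][input]."""
    total = 0.0
    for input_index in range(len(inputs)):
        total += weights[hidden_index][input_index] * inputs[input_index]
    return total

def summed_input_swapped(weights, inputs, hidden_index):
    """Same loop with the indices swapped, reading weights[input][hidden]."""
    total = 0.0
    for input_index in range(len(inputs)):
        total += weights[input_index][hidden_index] * inputs[input_index]
    return total

inputs = [0.0, 1.0]

h0 = summed_input(w_input_to_hidden, inputs, 0)                  # uses 0.1 and 0.2
h0_swapped = summed_input_swapped(w_input_to_hidden, inputs, 0)  # uses 0.1 and 0.3

print(h0, h0_swapped)
```

Both versions run without error, which is what makes this bug hard to spot: the off-diagonal weights are exchanged, so the two results only agree when the weight array happens to be symmetric.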
The following figure shows how the connection weights are represented in Python arrays.
5 thoughts on “Neural Networks and Python Code: Be Careful with the Array Indices!”
Good article and very appropriate warning for the subject at hand.
However, this may also be seen as an excellent case of the shortcomings of Python with an eye toward it being a procedural language. Indeed, utilizing multiple arrays vs. structures that could encapsulate information is a tendency that all programmers, especially those of us with a mathematical or computational science background, can fall into. We tend to think of algorithms with a slant toward our old favorite – FORTRAN – and accordingly design our code along the strengths and weaknesses of that language.
In the cited case, use of objects to bind weights to data has obvious advantages, once a structure is designed. Admittedly, designing such objects in a way that efficiently represents our data can be time-consuming. And sometimes the data refuses to conform. In that case it might be worth considering a restructuring to see if another form is acceptable – frequently such changes bring new insights into the problem as well.
As a former die-hard FORTRAN and C coder, I have found myself wanting to fall back on arrays, and almost to a certainty, when re-examining the code, found that other data types yielded code that was more succinct, easier to debug, and in many cases faster (due to reduced lines of code and compiler optimizations).
I’ve found other advantages to OO coding languages. My own work involves hyper-realistic combat simulation. As such, there is a great deal of code dedicated to explosions. The literature on the topic is extensive, but a great deal of the derived equations use mixed units (e.g. GPa of pressure generates PSI of pressure at so many feet away). Rather than re-deriving the formulas for pure MKS units or seeding the code with conversion constants, I eventually designed a series of object classes that can accept or display data in multiple unit systems. Then my code can use the published formulas directly while maintaining an MKS-based environment overall. Additionally, the lines became self-documenting, greatly aiding users wishing to inspect my work.
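A minimal sketch of what such a unit-aware class might look like in Python follows. It is purely illustrative: the class name, its methods, and the choice to store pascals internally are assumptions, and it is not the simulation code described above.

```python
class Pressure:
    """Pressure stored internally in pascals (MKS), readable in several units.

    Hypothetical illustration of a unit-aware value class.
    """

    PA_PER_PSI = 6894.757293168   # pascals in one pound per square inch
    PA_PER_GPA = 1.0e9            # pascals in one gigapascal

    def __init__(self, pascals):
        self.pascals = float(pascals)

    @classmethod
    def from_psi(cls, psi):
        """Build a Pressure from a value given in PSI."""
        return cls(psi * cls.PA_PER_PSI)

    @classmethod
    def from_gpa(cls, gpa):
        """Build a Pressure from a value given in GPa."""
        return cls(gpa * cls.PA_PER_GPA)

    @property
    def psi(self):
        """The stored pressure expressed in PSI."""
        return self.pascals / self.PA_PER_PSI

    @property
    def gpa(self):
        """The stored pressure expressed in GPa."""
        return self.pascals / self.PA_PER_GPA


# A published formula can take GPa in and report PSI out,
# while everything in between stays in pascals.
source_pressure = Pressure.from_gpa(1.0)
print(source_pressure.pascals, source_pressure.psi)
```

With a class like this, the unit of each quantity is named at the point where it enters or leaves the code, so no bare conversion constants appear in the formulas themselves.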
Admittedly there are times when arrays have to be used. But I think those cases are much rarer than one would initially believe.
This is a beautiful observation, Gregg, and I totally agree with you. Given our druthers, OO is definitely superior to the predecessor style of structured code.
Let me offer an example in support of what you’ve just said. Years ago, developing the CORTECON I (a neural network that uses free energy minimization), my student researcher / coder started off writing code in regular C. A few months into the project, he gave me the bad news. We were going to have to put the “research” on hold while he totally rebuilt the code from an OO perspective, in C++. It was a tough nugget to swallow, but I agreed. We had a month-long time-out on research progress (by this time, the code was pretty huge). The resulting code was much more robust and could be more readily expanded.
I’ve found the same thing myself, recently. I began my CORTECON II work in straightforward, structured Python. Lots of arrays; no objects. As we move into the real meat of CORTECON II capabilities, I see that an OO approach will be essential.
In the case of writing code for students, though, I’m not sure that going OO – right away – is the best approach. This is simply because we have students for whom writing regular Python code is still a big challenge. They’re smart, they’re inquisitive, they’re aggressive – but only 20% (on the average) come in with some confidence in their Python skills. Another 20% are absolute Python newbies, and the remaining 60% are either just starting or in need of a serious refresher.
We’re switching to more of a Python (and R) emphasis in Northwestern University’s MSPA program. That means that more students entering this course will have a stronger Python background.
So, your point is well taken. I’m seeing a switch from regular structured code to OO coming up … perhaps I can keep both versions of code active, for students who find the jump to OO programming, in the midst of learning NN algorithms, just a bit too much.
But very good heads-up, and I agree with you. Now, it’s just finding the time.
I agree that OO coding can be a challenge to the novice who is struggling just to learn programming languages at all. You do have to cater to the abilities of the students.
But there are good reasons for having those students at least begin to master OO coding. Foremost is the design process. Time and time again I’ve seen projects where the designer took the approach of building simply, with plans to bolt on the needed complexity later. A typical example would be designing a simulation first for single-user use and expecting to add code for multiplayer later. The coder inevitably ends up chasing rabbits all over the project, finding special case after special case – making a mess out of his/her initial “clean” design. If instead the effort had been taken to design with multiplayer in mind from the beginning, most of those annoying special cases would collapse into a far smaller set or disappear entirely.
So yes, the novice is going to moan about having to learn OO techniques. Just like young coders always moan about having to write documentation. It’s a pain and in their limited perspective it does not bring any utility to their efforts. Yet after enough exposure they will begin to appreciate the power and elegance of these techniques – how it can simplify the code by vast amounts while rendering that same code far more legible.
In closing, shouldn’t students taking such an advanced class know or be willing to learn OO coding? Procedural techniques certainly have their place. But OO is becoming more and more of a dominant feature in the research and commerce job spaces; it is best to begin mastering it now rather than later.
Gregg, AJ, a very interesting discussion. Thanks for sharing.
Thanks, Gregg! Glad you found this useful!