Tags: python, arrays, numpy, indexing, boolean-indexing

Python effect of boolean values as index (a[a==0] = 1)


I am currently implementing some code I have seen on GitHub.

(https://gist.github.com/karpathy/a4166c7fe253700972fcbc77e4ea32c5)

The point of interest here is the following:

import numpy as np

def prepro(I):
    """ prepro 210x160x3 uint8 frame into 6400 (80x80) 1D float vector """
    I = I[35:195] # crop
    I = I[::2,::2,0] # downsample by factor of 2
    I[I == 144] = 0 # erase background (background type 1)
    I[I == 109] = 0 # erase background (background type 2)
    I[I != 0] = 1 # everything else (paddles, ball) just set to 1
    return I.astype(float).ravel() # flatten to a 1D float64 vector

The author is preprocessing an image here in order to train a neural network. The part I am confused about is:

I[I == 144] = 0 # erase background (background type 1)
I[I == 109] = 0 # erase background (background type 2)
I[I != 0] = 1 # everything else (paddles, ball) just set to 1

I think the author wants to set all elements of I that equal 144 or 109 to 0, and all remaining nonzero elements to 1. But if I am correct, a Boolean in Python just represents 0 or 1. Comparing a list with an integer therefore always results in False, and hence 0.

That would make I[I == x] equivalent to I[0] for any integer x, so why even bother to do this?

What am I missing here?


Solution

  • I is a NumPy array, not a Python list. NumPy arrays behave differently: comparisons and indexing work elementwise, much as in MATLAB.

    I == 144 does not produce a single True or False. It produces a boolean array with the same shape as I, which is True at every position where I holds 144 and False everywhere else.

    (The same holds for the other expressions.)

    Using such a boolean array as an index selects exactly the positions where it is True, so the assignment affects only those elements.
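A minimal sketch of this, using a small made-up array in place of the real frame (the pixel values 53 and 236 are placeholders chosen only for illustration). The last line shows the plain-list comparison the question assumed, for contrast:

```python
import numpy as np

# Hypothetical 2x3 stand-in for the cropped, downsampled frame
I = np.array([[144, 109, 53],
              [144, 236, 109]], dtype=np.uint8)

# Elementwise comparison: one boolean per element, same shape as I
mask = (I == 144)
print(mask)
# [[ True False False]
#  [ True False False]]

# Assignment through a boolean index touches only the True positions
I[I == 144] = 0   # background type 1 -> 0
I[I == 109] = 0   # background type 2 -> 0
I[I != 0] = 1     # everything that is left -> 1
print(I)
# [[0 0 1]
#  [0 1 0]]

# A plain Python list compared with an int really is just False
print([144, 109, 53] == 144)
# False
```

So the reasoning in the question is right for lists, but it does not carry over to NumPy arrays, which define == to work element by element.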