Tags: python, machine-learning, deep-learning, chess, python-chess

Why are logical shifts used when representing a chess board for a Deep Learning task?


Recently I came across a Twitch streamer who was working on his own deep-learning-based chess engine. While going through the code shown in the video, one thing I didn't quite understand was why he used logical shifts when preparing the input data (i.e. the chess board representations) for training. Here are the rough steps he followed:

  1. He fetched a dataset of chess games in PGN format
  2. Each move in each game produces a new board state. Each of these states is serialized in the following way:
  • He creates an 8x8 matrix that represents the 8x8 board after this specific move
  • The matrix stores 8-bit unsigned integers
  • He places all chess pieces on the board (i.e. in the matrix)
  • The white pieces are defined as follows: {"P": 1, "N": 2, "B": 3, "R": 4, "Q": 5, "K": 6}
  • The black pieces are defined as: {"p": 9, "n": 10, "b": 11, "r": 12, "q": 13, "k": 14}
  • This means, for instance, that a white pawn is stored as "1" in the matrix, whereas a black queen is stored as "13"
  3. After serializing the board, he generates the final board state from the original 8x8 matrix by executing some logical bit operations that I don't quite understand. Also, the newly generated (i.e. final) board state is not 8x8 but 5x8x8:

 # Init new board state
 final_boardstate = np.zeros((5, 8, 8), np.uint8)

 # old_boardstate is the initial 8x8 matrix containing uint8 values
   
 # Bit operations that I don't understand
 final_boardstate[0] = (old_boardstate >> 3) & 1
 final_boardstate[1] = (old_boardstate >> 2) & 1
 final_boardstate[2] = (old_boardstate >> 1) & 1
 final_boardstate[3] = (old_boardstate >> 0) & 1

Can anyone help me understand the logic behind these operations? As far as I can tell, he creates four different 8x8 board representations inside a 5x8x8 array, each based on a different logical right shift (3, 2, 1, and 0 bits). However, I am not completely sure this assumption is correct, and I don't really know the reasoning behind running these operations in the context of chess board representations.


Solution

  • These are the piece codes in binary:

    White: P = 0001, N = 0010, B = 0011, R = 0100, Q = 0101, K = 0110
    Black: p = 1001, n = 1010, b = 1011, r = 1100, q = 1101, k = 1110

    You can see that the leftmost of the four bits is always 1 for black pieces and always 0 for white pieces; that is why the values 7 and 8 were skipped. With

    (old_boardstate>> 3) & 1

    The color-indicating bit is shifted all the way to the right, and the & 1 masks off everything except that one bit. So the expression evaluates to 1 if the piece is black and to 0 otherwise. The remaining three bits encode the piece type independently of its color. The bit operations you don't understand simply pull the individual bits out of each 8-bit integer and store them in the NumPy array, one bit plane per channel. That array is the input to the neural network; it has dimensions 5x8x8 because five input neurons are used to represent each square of the board (the snippet you posted only fills the first four planes).
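
    To make the bit planes concrete, here is a small self-contained sketch. The piece codes and the shift expressions come from the question; the board coordinates and piece placement are made up for illustration, and what the streamer puts into the fifth plane is not shown in the snippet, so it is left at zero here.

```python
import numpy as np

# Piece codes from the question: white pieces are 1-6, black pieces are 9-14
old_boardstate = np.zeros((8, 8), np.uint8)
old_boardstate[6, 0] = 1    # white pawn  -> 0001
old_boardstate[0, 3] = 13   # black queen -> 1101
old_boardstate[0, 4] = 14   # black king  -> 1110

# Extract one bit plane per shift, exactly as in the question;
# the fifth plane stays zero because the snippet does not show its contents
final_boardstate = np.zeros((5, 8, 8), np.uint8)
for i, shift in enumerate((3, 2, 1, 0)):
    final_boardstate[i] = (old_boardstate >> shift) & 1

# Plane 0 is the color plane: 1 wherever a black piece stands
print(final_boardstate[0, 0, 3])   # 1 (black queen)
print(final_boardstate[0, 6, 0])   # 0 (white pawn)

# Stacking the four planes back together recovers the original piece codes,
# so the decomposition loses no information
reconstructed = (final_boardstate[0] << 3) | (final_boardstate[1] << 2) \
              | (final_boardstate[2] << 1) | final_boardstate[3]
assert (reconstructed == old_boardstate).all()
```

    Splitting the value into bit planes this way gives the network binary inputs (one neuron per bit per square) instead of a single neuron that has to interpret the integer codes 1-14 numerically.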