Search code examples
databasealgorithmencodingcompressionbit

How to store a chess game efficiently?


What's the most space efficient way to store an entire chess game in a database? considering an avg game length of 50-70 moves (1 move means 2 player moves once), i hope it takes less than 800 bits.

The initial place of the piece could be stored in 6 bits and the move could take 2-4 bits. is there any way to store it less than that?


Solution

  • You could enumerate all legal moves in a given position, and sort them in an agreed way (like by starting square, then by target square, then by promotion piece -- if it concerns promotion, ...). Then identify each move by its index in this sorted list of legal moves.

    The (theoretical) position found with the most legal moves has 218 legal moves. We can be confident there is no position with more than 256 legal moves, so 8 bits are enough to encode a single move.

    With this encoding a game of 70 moves would need 560 bits.

    Dynamic number of bits

    It is however clear that in most positions the number of legal moves is more like 40 and not 218, so we could have the number of bits for the next encoded move depend dynamically on the number of legal moves that are available.

    Since the start position only has 20 moves, the first white move would be encoded with just 5 bits; the same for the first black move. As the game continues, there soon will be a position that has more than 32 legal moves, so 6 bits will be used. For instance, after

    1. e3 e5
    2. Qg4 d6
    

    ...white has 46 legal moves, so their move will be encoded using 6 bits. We already saved 3 x 4 = 12 bits (compared to 8 bits per move) for encoding these first four moves.

    While encoding, the algorithm will first check how many legal moves there are, and then use the corresponding number of bits to encode the next move. Also the decoder can know the number of legal moves and then consume that number of bits from the encoded stream.

    I don't know what would be a worst-case 70-move game in terms of number of needed bits (not feasible to compute in reasonable time), but it will be well below 500 bits.

    Minimizing for reasonable games

    The above derived limits are applicable to any game, even games where the players make the worst moves. If however you want to focus on reasonable games, you can further optimise the memory footprint:

    Use a deterministic chess engine that produces a score for every legal move, and use the score as a probability for that move actually being played. It is important that it will always produce the same score for the same move in the same game state. Use Huffman encoding to encode the move accordingly. This will mean that good moves will take fewer bits than bad moves. For instance, if the Queen can be captured, this could even become a move with just one bit. Stupid moves will need more bits than average, maybe even more than 8 bits (when there are many legal moves and they include very good and very bad moves). A 70-move game where almost all moves are bad could take up more bits than with the non-Huffman encoding algorithm, but reasonable games would use less.