Compact Way of Storing Chess State Information

chessboardcompactnesscomputer scienceinformation theory

I'm looking for a method or a solution to store the chess state information in the most compact way possible without losing information and without too much surplus in computers. According to Deedlit there about 7.7*10^45 possible legal chess positions. That means i want to be as close as possible to be only using 160 bits (cant go lower due to byte alignment).
My first approach was bitboards, 64 bits * 4 since all square states can be represent by 3 bits or 8 states from empty to king the extra 64 bits is for color. That makes it 256 bits + 6 (en passant) + 4 (castle rights both color). Thats too far from the goal, so i started pruning useless bits.

For the 3 bits for pieces, we dont actually require the other 2 states as there can only be 7 square states plus there can only 1 king for both color so bitboard is quite inefficient. So I decided to compact it by storing all pieces except king and pawn into the second and third bit. Next, i used the first bit to represent pawn and empty, and if its set, the second bit will identify whether its pawn or empty. Regardless, the third bit is free, that means we have 64 * 3 – 16 (free pawn bits at max) – 32 (free empty bits at max) = 144 for the positions alone. Now we have to work with colors, there are 32 pieces at the board at max so 32 bits will be used for color, so 144 + 32. Ohh we almost for got the two kings. There are only 64 possible places the king could be so thats 2^6 or 6 bits + 2 for both sides, thats 8 for 1 king. If we stack both king together we wouldn't need color bits, so thats 16. Well, we just consumed all the free bits. But wait, there's still en passant, so thats 6 for position. So, thats 196 bits in total.

Thats how far I could go, and that is why I decided to ask for help. My goal from this problem is just have a way to store chess state compact enough that i could create an AI that covers them all and have no room for hallucinations, but not too compact that it looses information that any AI architecture could not understand.

Best Answer

I believe you can do a decent amount better by taking advantage of the fact that "normal" positions only have 2 knights/bishops/rooks/queens (and for them to have more, there must be pawn promotions and each promotion requires deleting at least 1 pawns and one piece/pawn. As such, you can gain a lot by using a huffman encoding of piece positions where empty is 1 bit (since at least half of spaces must be empty), pawns are 3 bits, and pieces are 5 bits. On the 1st and 8th row, you can delete the possibility of pawns and store pieces with 4 bits. The final encoding is:

First encode king position and castling rights in 12 bits. There are 3612 possible positions to put the 2 kings such that they don't attack each other which leaves enough room for 58*6+1 states to encode can't castle right/can't castle left/can't castle for each color. Next encode en passant in 4 bits (every board has a maximum of 14 possible en passants possible so store 0 for invalid, or 1 to 14 for which of the up to 14 is active). Technically you could save a bit since there can only be more than 7 en passants when pieces have been captured, so you can store 3 bits or 4 bits for the count depending on whether there have been captures, but that sounds annoying. Then 6 bits for 50 move rule Last encode the rest of the board as follows (encoding kings as empty)

2nd through 7th rank:

empty: 0

pawn: 10*

knight: 1100*

bishop: 1101*

rook: 1110*

queen: 1111*

1st, 8th rank

empty: 0

knight: 100*

bishop: 101*

rook: 110*

queen: 111*

The worst case scenario here is no pieces captured and all pieces off the back rank, which takes 316 bits for the pawsn, 414 bits for the pieces and 32 bits for empty squares.

This totals 158 bits which fits under your 160 bit cap.