I'm trying to write a regular expression to validate moves in algebraic chess notation. Here's what I have so far:
/
O-O(-O)? # Castling
|[KQRBN]x?[a-h][1-8]\+? # Most normal moves and captures
/
Where I am lost is pawn promotion.
A knight, bishop, or centre pawn may promote only on the file it moves from, or on either adjacent file via a capture. A rook pawn may promote by moving straight ahead or by capturing to one side; which side depends on whether it is on the a- or h-file. So something like
/[a-h](x[a-h])?[18]=[QRBN]\+?/
doesn't work, because fxh8 is not a valid move (only fxe8 and fxg8 are). I could go the long route with
/(a(xb)?|h(xg)?|b(x[ac])?.../ # insert 5 more files in place of the ...
but that's not very efficient. I want to use grouping so that I can handle the rook pawns in one branch and every other pawn in another. I have something like this in mind:
/([b-g])(x(\1±1))?/
That is meant to say "a letter from b to g may be followed by x and the letter that comes just before or after it".
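Matching the adjacent letter is not hard using alternation, but you won't find anything compact, because there is no character arithmetic in regex: a backreference such as \1 can only match the exact text it captured again, never the letter before or after it.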
Someone just posted a similar question out of sympathy for your plight.
The long way is the only way.
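That said, the long pattern does not have to be typed by hand. Here is a minimal sketch that builds the eight-branch alternation in a loop. It assumes Python's re module, since the question does not name a regex flavour, and the names FILES, promotion_pattern and is_promotion are made up for illustration. Like the pattern in the question, it does not check which side is moving, so it accepts both rank 1 and rank 8.

```python
import re

FILES = "abcdefgh"  # hypothetical helper constant: the eight board files


def promotion_pattern() -> str:
    """Build the 'long way' regex: each file, optionally capturing onto a neighbour."""
    branches = []
    for i, file in enumerate(FILES):
        # Files directly left and right of this one that are still on the board.
        neighbours = "".join(
            FILES[j] for j in (i - 1, i + 1) if 0 <= j < len(FILES)
        )
        # Rook pawns have one neighbour (plain letter); the rest get a class.
        target = f"[{neighbours}]" if len(neighbours) > 1 else neighbours
        branches.append(f"{file}(?:x{target})?")
    # Promotion rank, promotion piece, optional check marker.
    return "(?:" + "|".join(branches) + r")[18]=[QRBN]\+?"


PROMOTION = re.compile(promotion_pattern())


def is_promotion(move: str) -> bool:
    """Return True if the whole string is a syntactically valid promotion move."""
    return PROMOTION.fullmatch(move) is not None
```

With this, is_promotion("fxe8=Q") and is_promotion("fxg8=Q") are accepted while is_promotion("fxh8=Q") is rejected, which is the behaviour the (\1±1) idea was after. The resulting regex is still the long one; only the typing is automated.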