I need help searching a text file containing thousands of lines for a specific sequence of text. Regular expressions are just beyond me at this point, though I've read through many web pages on how to implement them. I'm left scratching my head: I know it's possible, but it's a puzzle I just can't seem to solve. I'm currently using a JavaScript sandbox, but that is absolutely not necessary if a better language or tool exists elsewhere.
Specifically, I'm looking for one particular chess game I played that involved 5 successive knight moves as white. The issue I see is that the format of the text is:
"... ##. N*** X*** ##. N*** X*** ..."
Where:
Again, this is just a puzzle, but in my inexperience I am missing something, fundamentally or mechanically.
I have looked into the typical wildcard implementations that should make this easy, but my attempts at combining a dozen different regular expressions to split the text are coming up really short. The search function itself also stops at the first match, which may not be the game I am looking for, so I need to keep searching through the entire document and return all instances.
As an additional part of this question: I only need the position or line of each match, so that I can find it in the document and pull the associated game. Cursory searches give me
let position = text.search(regex);
and then just logging the position to the console seems easy but may be inefficient.
Is there a better method for this?
I don't have a comprehensive list of all of my regex attempts, I'm sorry.
Using the below at https://regexkit.com/javascript-regex gets me matches but no groups. I'm not sure where my mistake is.
^.*[0-9]\d\.\sNx?[a-h].*[1-8]\s.*[A-Z]x?.*[a-h].*[1-8]\s.*[0-9]\d\.\sNx?.*[a-h].*[1-8]\s.*[A-Z]x?.*[a-h].*[1-8]\s$
Something like the above should work as a prototype to find two successive knight moves. Extrapolating that further is a matter of copy-pasting, but even this eludes me.
If you want an example text to test your regex solutions, see below. The first game has four successive knight moves and the second game has two, which should help with the problem solving.
[Event "Live Chess"] [Site "Chess.com"] [Date "2023.02.17"] [Round "-"] [White "-"] [Black "-"] [Result "1-0"] [WhiteElo "925"] [BlackElo "939"] [TimeControl "180+2"] [EndTime "17:39:13 PST"] [Termination "- won by checkmate"]
e4 e5 2. Nc3 d6 3. Bc4 h6 4. d3 Nc6 5. f4 Qe7 6. Nf3 Be6 7. f5 Bxc4 8. dxc4 O-O-O 9. b3 Nf6 10. O-O g5 11. fxg6 fxg6 12. Nh4 g5 13. Ng6 Qg7 14. Nxh8 Qxh8 15. Nd5 Nxd5 16. exd5 Nd4 17. c3 Nxb3 18. axb3 a6 19. b4 Be7 20. Qg4+ Kb8 21. b5 axb5 22. cxb5 b6 23. Qa4 Kc8 24. Qa8+ Kd7 25. Qc6+ Kc8 26. Ra8# 1-0
[Event "Live Chess"] [Site "Chess.com"] [Date "2023.02.17"] [Round "-"] [White "-"] [Black "-"] [Result "1-0"] [WhiteElo "914"] [BlackElo "841"] [TimeControl "180+2"] [EndTime "19:06:57 PST"] [Termination "- won by resignation"]
e4 e5 2. Nc3 Nc6 3. Bc4 Nf6 4. d3 Bc5 5. Na4 Bb4+ 6. c3 Be7 7. f4 b5 8. Bxb5 a6 9. Bc4 O-O 10. f5 d5 11. exd5 Nxd5 12. Nf3 Bxf5 13. O-O Bc5+ 14. Nxc5 Nf4 15. Nxe5 Nxe5 16. Rxf4 Bg4 17. Rxg4 Nxg4 18. Qxg4 Re8 19. Bh6 g6 1-0
Thank you all for your time and patience.
Edit: Bad copy/paste of the game text, missing "15.". Added "N" as a possible opponent move for clarity/completeness.
The OP's serial pseudo placeholder pattern of
... ##. N*** X*** ##. N*** X*** ...
expressed as a grouped regex pattern translates into the following regex pseudo code ...
(?: pattern )+
... for at least one (or more) matching non-capturing group(s), or
(?: pattern ){2,}
... for a sequence of at least two (or more) matching non-capturing groups.
The minimum placeholder code of
##. N*** X***
together with the OP's description/requirements translates directly into either of the following two expressions ...
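##.   ->  \s+\d+\.                  (move number)
N***  ->  \s+Nx?[a-h][1-8]          (white's knight move, optional capture)
X***  ->  \s+[RBKQN]?x?[a-h][1-8]   (black's reply by any piece or pawn, optional capture)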
/(?:\s+\d+\.\s+Nx?[a-h][1-8]\s+[RBKQN]?x?[a-h][1-8])+/g
/(?:\s+\d+\.\s+Nx?[a-h][1-8]\s+[RBKQN]?x?[a-h][1-8]){2,}/g
... where a description of each pattern can be found at its test/playground page ...
/(?:\s+\d+\.\s+Nx?[a-h][1-8]\s+[RBKQN]?x?[a-h][1-8])+/g
/(?:\s+\d+\.\s+Nx?[a-h][1-8]\s+[RBKQN]?x?[a-h][1-8]){2,}/g
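Since the question asks for five successive knight moves, the `{2,}` quantifier can be generalized to any minimum run length. The following is only a minimal sketch built from the same pattern pieces as above; `buildKnightRunRegex` and its `minRun` parameter are hypothetical names introduced here for illustration.

```javascript
// Hypothetical helper: builds the pattern from above for a run of at least
// `minRun` consecutive moves in which white plays a knight.
function buildKnightRunRegex(minRun = 2) {
  const moveNumber = String.raw`\s+\d+\.`;                 // e.g. " 12."
  const whiteKnight = String.raw`\s+Nx?[a-h][1-8]`;        // e.g. " Nh4" or " Nxh8"
  const blackReply = String.raw`\s+[RBKQN]?x?[a-h][1-8]`;  // e.g. " g5", " Qxh8"
  return new RegExp(
    `(?:${moveNumber}${whiteKnight}${blackReply}){${minRun},}`,
    'g'
  );
}

// Excerpt of the first sample game (moves 11 to 16).
const moves =
  ' 11. fxg6 fxg6 12. Nh4 g5 13. Ng6 Qg7 14. Nxh8 Qxh8 15. Nd5 Nxd5 16. exd5 Nd4';

// `match` returns null when nothing matches, hence the `?? []` fallback.
const runsOfFour = moves.match(buildKnightRunRegex(4)) ?? [];
const runsOfFive = moves.match(buildKnightRunRegex(5)) ?? [];

console.log(runsOfFour.map(run => run.trim())); // the run from move 12 to move 15
console.log(runsOfFive.length);                 // no run of five in this excerpt
```

Like the patterns above, this sketch does not match moves that carry a check or mate suffix (`+`, `#`), castling, or disambiguated moves such as `Nbd2`; a run is cut short wherever one of those occurs.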
In case the OP wants to match any occurrence of a white knight's move, the use case is as follows ...
console.log(
`[Event "Live Chess"] [Site "Chess.com"] [Date "2023.02.17"] [Round "-"] [White "-"] [Black "-"] [Result "1-0"] [WhiteElo "925"] [BlackElo "939"] [TimeControl "180+2"] [EndTime "17:39:13 PST"] [Termination "- won by checkmate"]
e4 e5 2. Nc3 d6 3. Bc4 h6 4. d3 Nc6 5. f4 Qe7 6. Nf3 Be6 7. f5 Bxc4 8. dxc4 O-O-O 9. b3 Nf6 10. O-O g5 11. fxg6 fxg6 12. Nh4 g5 13. Ng6 Qg7 14. Nxh8 Qxh8 15. Nd5 Nxd5 16. exd5 Nd4 17. c3 Nxb3 18. axb3 a6 19. b4 Be7 20. Qg4+ Kb8 21. b5 axb5 22. cxb5 b6 23. Qa4 Kc8 24. Qa8+ Kd7 25. Qc6+ Kc8 26. Ra8# 1-0
[Event "Live Chess"] [Site "Chess.com"] [Date "2023.02.17"] [Round "-"] [White "-"] [Black "-"] [Result "1-0"] [WhiteElo "914"] [BlackElo "841"] [TimeControl "180+2"] [EndTime "19:06:57 PST"] [Termination "- won by resignation"]
e4 e5 2. Nc3 Nc6 3. Bc4 Nf6 4. d3 Bc5 5. Na4 Bb4+ 6. c3 Be7 7. f4 b5 8. Bxb5 a6 9. Bc4 O-O 10. f5 d5 11. exd5 Nxd5 12. Nf3 Bxf5 13. O-O Bc5+ 14. Nxc5 Nf4 15. Nxe5 Nxe5 16. Rxf4 Bg4 17. Rxg4 Nxg4 18. Qxg4 Re8 19. Bh6 g6 1-0`
.match(/(?:\s+\d+\.\s+Nx?[a-h][1-8]\s+[RBKQN]?x?[a-h][1-8])+/g)
.map(result => result.trim())
)
In case the OP needs the indices of any matching white knight's move, the above use case changes to ...
console.log([
...`[Event "Live Chess"] [Site "Chess.com"] [Date "2023.02.17"] [Round "-"] [White "-"] [Black "-"] [Result "1-0"] [WhiteElo "925"] [BlackElo "939"] [TimeControl "180+2"] [EndTime "17:39:13 PST"] [Termination "- won by checkmate"]
e4 e5 2. Nc3 d6 3. Bc4 h6 4. d3 Nc6 5. f4 Qe7 6. Nf3 Be6 7. f5 Bxc4 8. dxc4 O-O-O 9. b3 Nf6 10. O-O g5 11. fxg6 fxg6 12. Nh4 g5 13. Ng6 Qg7 14. Nxh8 Qxh8 15. Nd5 Nxd5 16. exd5 Nd4 17. c3 Nxb3 18. axb3 a6 19. b4 Be7 20. Qg4+ Kb8 21. b5 axb5 22. cxb5 b6 23. Qa4 Kc8 24. Qa8+ Kd7 25. Qc6+ Kc8 26. Ra8# 1-0
[Event "Live Chess"] [Site "Chess.com"] [Date "2023.02.17"] [Round "-"] [White "-"] [Black "-"] [Result "1-0"] [WhiteElo "914"] [BlackElo "841"] [TimeControl "180+2"] [EndTime "19:06:57 PST"] [Termination "- won by resignation"]
e4 e5 2. Nc3 Nc6 3. Bc4 Nf6 4. d3 Bc5 5. Na4 Bb4+ 6. c3 Be7 7. f4 b5 8. Bxb5 a6 9. Bc4 O-O 10. f5 d5 11. exd5 Nxd5 12. Nf3 Bxf5 13. O-O Bc5+ 14. Nxc5 Nf4 15. Nxe5 Nxe5 16. Rxf4 Bg4 17. Rxg4 Nxg4 18. Qxg4 Re8 19. Bh6 g6 1-0`
.matchAll(/(?:\s+\d+\.\s+Nx?[a-h][1-8]\s+[RBKQN]?x?[a-h][1-8])+/g)]
.map(result => result.index)
);
And in case the OP just wants to know the matches/indices of only consecutive white knight moves, both of the above use cases change to ...
// Matches a run of at least two consecutive moves in which white plays a knight.
const regXConsecutiveWhiteKnightMoves =
  /(?:\s+\d+\.\s+Nx?[a-h][1-8]\s+[RBKQN]?x?[a-h][1-8]){2,}/g;
const sampleText =
`[Event "Live Chess"] [Site "Chess.com"] [Date "2023.02.17"] [Round "-"] [White "-"] [Black "-"] [Result "1-0"] [WhiteElo "925"] [BlackElo "939"] [TimeControl "180+2"] [EndTime "17:39:13 PST"] [Termination "- won by checkmate"]
e4 e5 2. Nc3 d6 3. Bc4 h6 4. d3 Nc6 5. f4 Qe7 6. Nf3 Be6 7. f5 Bxc4 8. dxc4 O-O-O 9. b3 Nf6 10. O-O g5 11. fxg6 fxg6 12. Nh4 g5 13. Ng6 Qg7 14. Nxh8 Qxh8 15. Nd5 Nxd5 16. exd5 Nd4 17. c3 Nxb3 18. axb3 a6 19. b4 Be7 20. Qg4+ Kb8 21. b5 axb5 22. cxb5 b6 23. Qa4 Kc8 24. Qa8+ Kd7 25. Qc6+ Kc8 26. Ra8# 1-0
[Event "Live Chess"] [Site "Chess.com"] [Date "2023.02.17"] [Round "-"] [White "-"] [Black "-"] [Result "1-0"] [WhiteElo "914"] [BlackElo "841"] [TimeControl "180+2"] [EndTime "19:06:57 PST"] [Termination "- won by resignation"]
e4 e5 2. Nc3 Nc6 3. Bc4 Nf6 4. d3 Bc5 5. Na4 Bb4+ 6. c3 Be7 7. f4 b5 8. Bxb5 a6 9. Bc4 O-O 10. f5 d5 11. exd5 Nxd5 12. Nf3 Bxf5 13. O-O Bc5+ 14. Nxc5 Nf4 15. Nxe5 Nxe5 16. Rxf4 Bg4 17. Rxg4 Nxg4 18. Qxg4 Re8 19. Bh6 g6 1-0`;
// Log the matched move sequences; `match` returns null when nothing matches.
console.log(
  (sampleText.match(regXConsecutiveWhiteKnightMoves) ?? [])
    .map(result => result.trim())
);
// Log the start index of each match within `sampleText`.
console.log(
  [...sampleText.matchAll(regXConsecutiveWhiteKnightMoves)]
    .map(result => result.index)
);
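The question also asks for the position or line of each match. One possible approach, sketched here under the assumption that the whole file has already been read into a single string, is to count the newlines that precede each match index. `lineNumberAt` and `findKnightRuns` are hypothetical helper names, and the line breaks in the sample below are an assumption made for the example.

```javascript
// Hypothetical helper: 1-based line number of the character at `index`.
function lineNumberAt(text, index) {
  let line = 1;
  for (let i = 0; i < index; i++) {
    if (text[i] === '\n') line++;
  }
  return line;
}

// Hypothetical helper: every run of at least `minRun` white knight moves,
// together with the line on which the run's first move number starts.
function findKnightRuns(text, minRun = 2) {
  const pattern = new RegExp(
    String.raw`(?:\s+\d+\.\s+Nx?[a-h][1-8]\s+[RBKQN]?x?[a-h][1-8]){${minRun},}`,
    'g'
  );
  return [...text.matchAll(pattern)].map(match => {
    // The pattern starts with \s+, so skip the leading whitespace to get
    // the position of the first move number itself.
    const start = match.index + match[0].length - match[0].trimStart().length;
    return {
      line: lineNumberAt(text, start),
      index: start,
      moves: match[0].trim(),
    };
  });
}

// Shortened excerpt in the format of the question (assumed line breaks).
const pgn = [
  '[Event "Live Chess"]',
  '1. e4 e5 2. Nc3 d6 3. Bc4 h6',
  '[Event "Live Chess"]',
  '1. e4 e5 2. Nc3 Nc6 3. Bc4 Nf6 4. d3 Bc5',
  '14. Nxc5 Nf4 15. Nxe5 Nxe5 16. Rxf4 Bg4',
].join('\n');

const knightRuns = findKnightRuns(pgn);
console.log(knightRuns);
```

Counting newlines from the start of the text for every match is simple but scans the text repeatedly; for a file with only thousands of lines that is unlikely to matter, and calling `findKnightRuns(text, 5)` would restrict the search to runs of five or more.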
There are two additional variants of the base pattern which, for a multiline search, would match the game either entirely or partly while capturing the first occurring consecutive white knight moves.