I was following How to build your own AlphaZero AI using Python and Keras.
The git is here. In run.ipynb, there is this part of the code:
memory.clear_stmemory()
if len(memory.ltmemory) >= config.MEMORY_SIZE:
The post doesn't explain much about it.
What are memory.ltmemory and memory.stmemory used for?
If you haven't realized by now, ltmemory stands for long term memory, and stmemory stands for short term memory. I haven't yet had a long look at the GitHub code, but I do have a basic understanding of how AlphaZero and reinforcement learning come together (being a chess enthusiast myself).
Basically, the two memories split the data by time scale, much like we humans do. Short term memory holds time-local data (i.e. events that have recently happened), and long term memory holds more global data (i.e. entire games and their outcomes). Learning from both lets AlphaZero make decisions that benefit it not only in the short term, but also in the long term.
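To make that a bit more concrete, here is a minimal sketch of how a short term / long term split is commonly wired up in self-play training. I have not checked it against the repository, so treat it as an assumption rather than the author's implementation: only `ltmemory`, `stmemory`, `clear_stmemory` and `MEMORY_SIZE` come from your snippet, while `commit_stmemory`, `commit_ltmemory`, the sample fields and the toy loop are names I made up for illustration. It also stores one outcome for every position, which ignores whose turn it was.

```python
from collections import deque
import random


class Memory:
    """Hypothetical two-stage buffer; not taken from the repository."""

    def __init__(self, memory_size):
        # Both buffers drop their oldest entries once they are full.
        self.stmemory = deque(maxlen=memory_size)  # positions of the game in progress
        self.ltmemory = deque(maxlen=memory_size)  # finished games, used for training

    def commit_stmemory(self, state, action_values):
        # The game is still running, so its outcome is not known yet.
        self.stmemory.append({"state": state, "AV": action_values, "value": None})

    def commit_ltmemory(self, outcome):
        # The game is over: label every stored position and move it to long term memory.
        for sample in self.stmemory:
            sample["value"] = outcome
            self.ltmemory.append(sample)
        self.clear_stmemory()

    def clear_stmemory(self):
        self.stmemory.clear()


MEMORY_SIZE = 4  # tiny value so the example is easy to follow
memory = Memory(MEMORY_SIZE)
batch = []

for game in range(2):
    memory.clear_stmemory()  # start each game with an empty short term memory
    for move in range(3):
        memory.commit_stmemory(state=(game, move), action_values=[0.5, 0.5])
    memory.commit_ltmemory(outcome=1)

    # Only start training once enough finished positions have been collected.
    if len(memory.ltmemory) >= MEMORY_SIZE:
        batch = random.sample(list(memory.ltmemory), 2)
```

Under this reading, the two lines from run.ipynb would mean "forget the previous game's scratch data" and "retrain only when the long term buffer is full", but again, that is my guess from the names.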
Does this make sense, or answer your question at all? I typed this quickly and gave a fairly high-level description of what is going on. Leave a comment if there is a part you want me to go into more detail on.