How many states could I work with on my ordinary home computer when I want to implement a reinforcement learning algorithm such as Q-Learning? 1 thousand, 1 million, more?
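It is inadvisable to use a very large number of states with plain (tabular) Q-learning. Looking up a state is not the bottleneck, since reading a table entry is a cheap index operation. The real costs are memory, because the table stores one value per state-action pair, and training time, because each state-action pair has to be visited many times before its estimate becomes reliable.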
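To get a rough feel for the numbers asked about in the question, here is a back-of-the-envelope sketch of how much memory a dense Q-table needs. The function name `q_table_bytes`, the choice of 4 actions, and 4 bytes per value (32-bit floats) are assumptions made for illustration; plug in your own environment's figures.

```python
def q_table_bytes(n_states: int, n_actions: int, bytes_per_value: int = 4) -> int:
    """Memory for a dense Q-table holding one value per (state, action) pair."""
    return n_states * n_actions * bytes_per_value


# Assumed for illustration: 4 actions, 32-bit floats.
for n_states in (1_000, 1_000_000, 1_000_000_000):
    mib = q_table_bytes(n_states, n_actions=4) / 2**20
    print(f"{n_states:>13,} states -> {mib:,.2f} MiB")
```

Under these assumptions a million states fits comfortably in RAM, while a billion states is already beyond most home machines. Memory is usually not the first limit you hit, though: the number of environment steps needed to visit every entry often becomes impractical well before the table stops fitting.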
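So the solution is to use something more advanced than naive Q-learning. Look at Deep Q-learning (DQN) and other popular RL variants such as A3C. Instead of storing one value per state, they approximate the value function (or the policy) with a neural network, which helps to avoid this issue.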
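The core idea behind those methods can be sketched without a deep learning library. The example below is not DQN or A3C; it is a much simpler stand-in, Q-learning with a linear function approximator, where the class `LinearQ` and the feature vectors are hypothetical names chosen for illustration. What it shows is that the number of stored parameters depends on the number of features and actions, not on the number of states.

```python
class LinearQ:
    """Q(s, a) is approximated as the dot product w[a] . features(s)."""

    def __init__(self, n_features, n_actions, alpha=0.1, gamma=0.99):
        # One weight vector per action: n_actions * n_features parameters in total.
        self.w = [[0.0] * n_features for _ in range(n_actions)]
        self.alpha = alpha  # learning rate
        self.gamma = gamma  # discount factor

    def q(self, features, action):
        return sum(w_i * x_i for w_i, x_i in zip(self.w[action], features))

    def best_action(self, features):
        return max(range(len(self.w)), key=lambda a: self.q(features, a))

    def update(self, features, action, reward, next_features, done):
        # Q-learning target: r + gamma * max_a' Q(s', a'), or just r at episode end.
        target = reward
        if not done:
            target += self.gamma * max(
                self.q(next_features, a) for a in range(len(self.w))
            )
        td_error = target - self.q(features, action)
        # Semi-gradient step: the gradient of Q w.r.t. w[action] is the feature vector.
        for i, x_i in enumerate(features):
            self.w[action][i] += self.alpha * td_error * x_i
        return td_error


# Usage with a made-up 3-feature state description.
agent = LinearQ(n_features=3, n_actions=2)
state = [1.0, 0.5, -0.2]
err = agent.update(state, action=1, reward=1.0, next_features=state, done=True)
```

DQN follows the same pattern but replaces the linear model with a neural network and adds stabilising tricks such as experience replay and a target network, so in practice you would reach for an existing implementation rather than writing it by hand.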