I have a question about rewards in RL. Is this sentence true, and if so, why? Thank you in advance.
"the reward each time (for the same action from the same state) needs not to be the same."
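Yes, the sentence is true in general: the reward for a given state and action does not have to be a fixed number. In a deterministic perfect-information game such as Go or Chess, the same action from the same state always gives the same reward. But in other games, the reward for the same observed state and action can differ from one time to the next, because it depends on the current internal state of the game (hidden information or randomness that the agent does not see).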
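
To make the distinction concrete, here is a minimal Python sketch. Both reward functions, the state name `"s0"`, the action name `"a0"`, and the 50% payout probability are made up for illustration and do not come from any particular game or library. It samples the reward for the same state-action pair many times in a deterministic setting and in a stochastic one.

```python
import random

def deterministic_reward(state, action):
    # Hypothetical deterministic game: the reward is a fixed
    # function of (state, action), so repeated calls always agree.
    return 1.0 if (state, action) == ("s0", "a0") else 0.0

def stochastic_reward(state, action, rng):
    # Hypothetical stochastic game: the same (state, action) pays 1.0
    # with probability 0.5 (an assumed value) and 0.0 otherwise.
    return 1.0 if rng.random() < 0.5 else 0.0

rng = random.Random(0)  # seeded so the run is repeatable

# Take the same action from the same state 1000 times in each setting.
det_samples = [deterministic_reward("s0", "a0") for _ in range(1000)]
sto_samples = [stochastic_reward("s0", "a0", rng) for _ in range(1000)]

print("distinct deterministic rewards:", set(det_samples))
print("distinct stochastic rewards:", set(sto_samples))
print("average stochastic reward:", sum(sto_samples) / len(sto_samples))
```

In the stochastic case the individual rewards vary, but their average settles near the expected reward, and that expectation is what value-based RL methods typically estimate.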