machine-learning, artificial-intelligence, reinforcement-learning

Stationarity concept in sequential decision problems in reinforcement learning


Below is a text snippet on sequential decision problems from the book Artificial Intelligence: A Modern Approach by Stuart Russell and Peter Norvig, Chapter 17, Section 17.1.

Stationarity for preferences means the following:

if two state sequences [s0, s1, s2, . . .] and [s0', s1', s2', . . .] begin with the same state (i.e., s0 = s0'), then the two sequences should be preference-ordered the same way as the sequences [s1, s2, . . .] and [s1', s2', . . .].

In English, this means that if you prefer one future to another starting tomorrow, then you should still prefer that future if it were to start today instead.

I am having difficulty understanding the last statement:

In English, this means that if you prefer one future to another starting tomorrow, then you should still prefer that future if it were to start today instead.

Kindly elaborate and explain.


Solution

  • Another definition of stationarity, from Wikipedia, may help in understanding the idea:

    In mathematics and statistics, a stationary process is a stochastic process whose unconditional joint probability distribution does not change when shifted in time.

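    The key phrase is "does not change when shifted in time". Applied to preferences, this means the ordering should not depend on the time at which it is made: your preference regarding day 3 should be the same whether you are at day 2 (tomorrow) or day 1 (today). In the book's terms, if two sequences share the same first state, comparing the full sequences must give the same answer as comparing what remains after that first state is dropped.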
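
A small numeric check may make this concrete. The sketch below assumes utilities are computed as a discounted sum of rewards, U = r0 + γ·r1 + γ²·r2 + ..., which is one of the utility forms the chapter discusses in connection with stationarity. The reward values, the discount γ = 0.9, and the function names `discounted_utility` and `prefers` are made up for illustration and do not come from the book.

```python
def discounted_utility(rewards, gamma=0.9):
    """Sum of gamma**t * r_t over a finite reward sequence."""
    return sum(gamma**t * r for t, r in enumerate(rewards))


def prefers(seq_a, seq_b, gamma=0.9):
    """True if seq_a has strictly higher discounted utility than seq_b."""
    return discounted_utility(seq_a, gamma) > discounted_utility(seq_b, gamma)


# Two hypothetical futures, evaluated on their own (as if they started today).
tail_a = [1.0, 0.0, 5.0]
tail_b = [2.0, 2.0, 0.0]

# The same two futures pushed one step later, behind a shared first reward
# (the analogue of both sequences beginning with the same state s0).
shared_first = 3.0
full_a = [shared_first] + tail_a
full_b = [shared_first] + tail_b

# Stationarity: both comparisons must come out the same way.
print(prefers(tail_a, tail_b))
print(prefers(full_a, full_b))
```

Both comparisons agree because prepending a shared reward r0 turns each utility U into r0 + γ·U, which adds the same constant to both sides and scales both by the same positive factor, so the ordering cannot flip.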