set.seed issue with sample when changing the order of values

set.seed(59)
mean(sample(c(12,7,5),7,prob = c(.3,.3,.4),replace = T))
[1] 9.571429

set.seed(59)
mean(sample(c(5,7,12),7,prob = c(.4,.3,.3),replace = T))
[1] 8.142857

Shouldn't both snippets return the same sample mean? Why are the results different?

Solution

Well, first consider the simpler case where you leave off the prob= argument:

set.seed(59) 
sample(c(12,7,5),7,replace = T)
# [1]  5 12 12  5  5 12  5
set.seed(59) 
sample(c(5,7,12),7,replace = T)
# [1] 12  5  5 12 12  5 12

Because you have different input, you get a different result. But also note that the sample function is really sampling from the vector's indexes, not from the actual values of the vector. See how in the second result, you've basically just swapped the 5s and the 12s. The only thing that matters is the length of the input vector. You can see this if you sample the indexes directly:

set.seed(59) 
sample(1:3,7,replace = T)
# [1] 3 1 1 3 3 1 3

See how you still get the same "caaccac" pattern (the middle value is never picked). That's what setting the seed does for you. You only get exactly the same result if all other parameters are identical.
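As a quick check of the index idea, here is a minimal sketch. It assumes the documented behaviour that sample(x, ...) on a vector of length greater than one is equivalent to x[sample.int(length(x), ...)]. The names x, y, idx, from_x and from_y are only illustrative. The exact draws depend on your R version's RNG settings, so no specific output is shown, but both comparisons should come back TRUE.

```r
x <- c(12, 7, 5)
y <- c(5, 7, 12)

# Draw the index pattern once.
set.seed(59)
idx <- sample.int(3, 7, replace = TRUE)

# Draw from each vector after resetting to the same seed.
set.seed(59)
from_x <- sample(x, 7, replace = TRUE)

set.seed(59)
from_y <- sample(y, 7, replace = TRUE)

# Each result is just the shared index pattern applied to its own vector.
identical(from_x, x[idx])
identical(from_y, y[idx])
```

So with the same seed and no prob=, reordering the values only relabels the same sequence of positions.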

If you change the order of the values in the vector and swap the probabilities to match, you won't get the same observations from a pseudorandom number generator like the one R uses. It's simply not "smart" enough to see that those describe the same statistical distribution. However, if you draw many samples over and over again, in the long run they will have similar means thanks to the law of large numbers.
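To see the long-run behaviour, here is a rough sketch that repeats the 7-draw experiment many times for each ordering. The number of repetitions (10000) and the variable names are arbitrary choices for illustration. The theoretical mean of the distribution is 12 * 0.3 + 7 * 0.3 + 5 * 0.4 = 7.7, and both averages should land close to it even though the individual samples differ.

```r
vals_a <- c(12, 7, 5); prob_a <- c(.3, .3, .4)
vals_b <- c(5, 7, 12); prob_b <- c(.4, .3, .3)

# Theoretical mean, identical for both orderings.
expected <- sum(vals_a * prob_a)

# Repeat the 7-draw sample many times and keep each sample mean.
set.seed(59)
means_a <- replicate(10000, mean(sample(vals_a, 7, prob = prob_a, replace = TRUE)))

set.seed(59)
means_b <- replicate(10000, mean(sample(vals_b, 7, prob = prob_b, replace = TRUE)))

# Compare the long-run averages with the theoretical mean.
c(expected = expected, a = mean(means_a), b = mean(means_b))
```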