Search code examples
pythonsortingrandom

How is result of 'random' function connected with list index?


This is very simple piece of code from one of educational course.

import random
l=[1,2,3,4,5,6,7,8,9,10]
sorted(l, key=lambda x:random.random())

As a result we have shuffled list in a random order. For instance:

[7, 4, 9, 1, 10, 8, 6, 3, 2, 5]

I printed generated random values for this list. They were:

0.2991455199706915
0.6909636166874692
0.6588993793087247
0.2464197124358396
0.8021604268319649
0.4982523946322083
0.14000601629016884
0.3598504556263765
0.25865161068212017
0.3534562884598361

My question is how one of these values connected with index in the list? For instance, why value of x = 0.2991455199706915 means 7th element of the list and value of x = 0.6909636166874692 means 4th element of the list?

If it had been done in C++, for instance, I would have scaled the values generated random function from interval [0,1) to the interval [0,10) and obviously round generated values to the nearest integer. It looks like I have similar process here under the hood. I would like to know how it works.


Solution

  • I believe the "evidence" you have ("x = 0.29914... means 7th element of the list", etc.) is not correct.

    If you change this so you save the generated sort order key in a dict and then peek at those exact values, you'll see that it's doing exactly what you'd expect – Python calls your get-key function once for each item of the list, and the result list is sorted by that value.

    import random
    
    keys = {}
    
    
    def get_key(x):
        print("Getting a key for", x)
        keys[x] = key = random.random()
        return key
    
    
    print(sorted([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], key=get_key))
    print(keys)
    print(sorted((sort_key, original_number) for (original_number, sort_key) in keys.items()))
    

    This prints out (e.g. – it's random!, and I took the liberty of shortening the numbers a bit)

    Getting a key for 1
    Getting a key for 2
    [...]
    Getting a key for 10
    [4, 2, 5, 10, 8, 6, 1, 9, 3, 7]
    {1: 0.91042, 2: 0.12800, 3: 0.97425, 4: 0.04509, 5: 0.27770, 6: 0.73670, 7: 0.98525, 8: 0.42114, 9: 0.95332, 10: 0.41352}
    [(0.04509, 4), (0.12800, 2), (0.27770, 5), (0.41352, 10), (0.42114, 8), (0.73670, 6), (0.91042, 1), (0.95332, 9), (0.97425, 3), (0.98525, 7)]
    

    so as you can see, when the keys dict is sorted by value, you get the same 4,2,5,10,... order.

    However, this is a bad way to shuffle a list – one should simply use random.shuffle() (on a copy of the list, if you don't want to mess with the original), which implements Fisher-Yates shuffle.