Search code examples
phpmysqlrandom

MySQL RAND() how often can it be used? does it use /dev/random?


I have a table with few rows (tops 50), I need to get random value out of table I can do that by
ORDER BY RAND() LIMIT 1
Main question is in the point when I have 6k selects in 5 seconds is rand stil 'reliable'? How is rand calculated, can I seed it over time? (idk, every 5 seconds).


Solution

  • The MySQL pseudo-random number generator is completely deterministic. The docs say:

    RAND() is not meant to be a perfect random generator. It is a fast way to generate random numbers on demand that is portable between platforms for the same MySQL version.

    It can't use /dev/random because MySQL is designed to work on a variety of operating systems, some of which don't have a /dev/random.

    MySQL initializes a default seed at server startup, using the integer returned by time(0). If you're interested in the source line, it's in the MySQL source in file sql/mysqld.cc, function init_server_components(). I don't think it ever re-seeds itself.

    Then the subsequent "random" numbers are based solely on the seed. See source file mysys_ssl/my_rnd.cc, function my_rnd().


    The best practice solution to your random-selection task, for both performance and quality of randomization, is to generate a random value between the minimum primary key value and maximum primary key value. Then use that random value to select a primary key in your table:

    SELECT ... FROM MyTable WHERE id > $random LIMIT 1
    

    The reason you'd use > instead of = is that you might have gaps in the id due to rows being deleted or rolled back, or you might have other conditions in your WHERE clause so that you have gaps in between rows that match your conditions.

    The disadvantages of this greater-than method:

    • Rows following such a gap have a higher chance of being chosen, and the larger the gap the greater the chance.
    • You need to know the MIN(id) and MAX(id) before you generate the random value.
    • Doesn't work as well if you need more than one random row.

    Advantages of this method:

    • It's much faster than ORDER BY RAND(), even for a modest table size.
    • You can use a random function outside of SQL.