Pseudo Random Repeatable Sort in SQL Server (not NEWID() and not RAND())


I would like to randomly sort a result set in a repeatable fashion, for purposes such as paging. NEWID() is too random for this, because the same ordering cannot be re-obtained. ORDER BY RAND(seed) would be ideal, since the same seed would produce the same random ordering. Unfortunately, RAND(seed) is re-seeded for every row, so it does not behave like a single random sequence. Does anyone have a solution?

declare @seed as int;
set @seed = 1000;

create table temp (
id int,
date datetime)

insert into temp (id, date) values (1,'20090119')
insert into temp (id, date) values (2,'20090118')
insert into temp (id, date) values (3,'20090117')
insert into temp (id, date) values (4,'20090116')
insert into temp (id, date) values (5,'20090115')
insert into temp (id, date) values (6,'20090114')

-- re-seeds for every item
select *, RAND(), RAND(id+@seed) as r from temp order by r
--1 2009-01-19 00:00:00.000 0.277720118060575   0.732224964471124
--2 2009-01-18 00:00:00.000 0.277720118060575   0.732243597442382
--3 2009-01-17 00:00:00.000 0.277720118060575   0.73226223041364
--4 2009-01-16 00:00:00.000 0.277720118060575   0.732280863384898
--5 2009-01-15 00:00:00.000 0.277720118060575   0.732299496356156
--6 2009-01-14 00:00:00.000 0.277720118060575   0.732318129327415
-- Note how the last column only increases by about 0.00002 per row

drop table temp

-- interestingly this works:
select RAND(@seed), RAND()
--0.732206331499865 0.306382810665955

Note: I tried RAND(id), but the result just comes out sorted by id. Apparently RAND(n) < RAND(n+1).
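
The near-linear behaviour in the note above can be checked in isolation with a small sketch like the one below. It assumes SQL Server 2008 or later for the VALUES row constructor; the derived table `seeds` and its column `n` are made-up names for illustration and are not part of the question.

```sql
-- Compare RAND() for consecutive integer seeds.
-- seeds(n) is a hypothetical inline table used only for this illustration.
SELECT n,
       RAND(n)               AS r,
       RAND(n + 1) - RAND(n) AS step  -- difference to the next seed's value
FROM (VALUES (1001), (1002), (1003), (1004), (1005), (1006)) AS seeds(n)
ORDER BY n;
```

In the output quoted in the question, consecutive values differ by roughly 0.0000186, which is why ordering by RAND(id + @seed) simply reproduces the order of id.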


Solution

  • Building on gkrogers' hash suggestion, this works great. Any thoughts on performance?

    declare @seed as int;
    set @seed = 10;
    
    create table temp (
    id int,
    date datetime)
    
    insert into temp (id, date) values (1,'20090119')
    insert into temp (id, date) values (2,'20090118')
    insert into temp (id, date) values (3,'20090117')
    insert into temp (id, date) values (4,'20090116')
    insert into temp (id, date) values (5,'20090115')
    insert into temp (id, date) values (6,'20090114')
    
    -- hashes id + @seed for every row; the same seed always gives the same order
    select *, HASHBYTES('md5',cast(id+@seed as varchar)) r
    from temp order by r
    --1 2009-01-19 00:00:00.000 0x6512BD43D9CAA6E02C990B0A82652DCA
    --5 2009-01-15 00:00:00.000 0x9BF31C7FF062936A96D3C8BD1F8F2FF3
    --4 2009-01-16 00:00:00.000 0xAAB3238922BCC25A6F606EB525FFDC56
    --2 2009-01-18 00:00:00.000 0xC20AD4D76FE97759AA27A0C99BFF6710
    --3 2009-01-17 00:00:00.000 0xC51CE410C124A10E0DB5E4B97FC2AF39
    --6 2009-01-14 00:00:00.000 0xC74D97B01EAE257E44AA9D5BADE97BAF
    
    drop table temp
    

    EDIT: Note that the declaration of @seed, as used in the query, could be replaced with a parameter, or with a constant int if dynamic SQL is used. (Declaring the variable in T-SQL fashion is not necessary.)
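
    For the paging use case from the question, the hash ordering could be combined with ROW_NUMBER(), roughly as sketched below. This is only one possible shape, not something tested against a large table. It assumes the temp table above still exists (run it before the drop), and the names @page, @pageSize, ordered and rn are hypothetical. The extra id in the ORDER BY is a tie-breaker in case two rows ever hash to the same value.

```sql
-- Page through the rows in the repeatable pseudo-random order.
-- @page and @pageSize are illustrative; like @seed, they could be
-- stored procedure parameters instead of local variables.
declare @seed as int;
declare @page as int;
declare @pageSize as int;
set @seed = 10;
set @page = 2;       -- 1-based page number
set @pageSize = 2;   -- rows per page

with ordered as (
    select id, date,
           ROW_NUMBER() over (
               order by HASHBYTES('md5', cast(id + @seed as varchar(12))), id
           ) as rn
    from temp
)
select id, date
from ordered
where rn between (@page - 1) * @pageSize + 1 and @page * @pageSize
order by rn;
```

    Given the ordering shown in the output above (1, 5, 4, 2, 3, 6), page 2 with a page size of 2 should contain ids 4 and 2. Changing @seed gives a different but equally repeatable order.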