
How to randomize a PHP array of records, giving more weight to more recent items?


I have an array of records from a database (the database itself is irrelevant to this question: the result ends up as an array of "rows", where each row is an array with string keys corresponding to the field names). For example:

$items = array(
    1 => array('id' => 1, 'name' => 'John', 'created' => '2011-08-14 8:47:39'),
    2 => array('id' => 2, 'name' => 'Mike', 'created' => '2011-08-30 16:00:12'),
    3 => array('id' => 5, 'name' => 'Jane', 'created' => '2011-09-12 2:30:00'),
    4 => array('id' => 7, 'name' => 'Mary', 'created' => '2011-09-14 1:18:40'),
    5 => array('id' => 16, 'name' => 'Steve', 'created' => '2011-09-14 3:10:30'),
    //etc...
);

What I want to do is shuffle this array, but somehow give more "weight" to items with a more recent "created" timestamp. The randomness does not have to be perfect, and the exact weight does not really matter to me. In other words, if there's some fast and simple technique that kinda-sorta seems random to humans but isn't mathematically random, I'm okay with that. Also, if this is not easy to do with an "infinite continuum" of timestamps, it would be fine with me to assign each record to a day or a week, and just do the weighting based on which day or week they're in.

A relatively fast/efficient technique is preferable since this randomization will occur on every page load of a certain page in my website (but if it's not possible to do efficiently, I'm okay with running it periodically and caching the result).


Solution

Partially inspired by the response from @Tadeck, I came up with a solution. It is somewhat long-winded, so if anyone can simplify it, that would be great, but it seems to work just fine:

    //Determine lowest and highest timestamps
    $first_item = reset($items); //reset() returns the first element regardless of its key
    $min_ts = strtotime($first_item['created']);
    $max_ts = strtotime($first_item['created']);
    foreach ($items as $item) {
        $ts = strtotime($item['created']);
        if ($ts < $min_ts) {
            $min_ts = $ts;
        }
        if ($ts > $max_ts) {
            $max_ts = $ts;
        }
    }
    
    //bring down the min/max to more reasonable numbers
    $min_rand = 0;
    $max_rand = $max_ts - $min_ts;
    
    //Create an array of weighted random numbers for each item's timestamp
    $weighted_randoms = array();
    foreach ($items as $key => $item) {
        $random_value = mt_rand($min_rand, $max_rand); //use mt_rand for a higher max value (on some platforms and older PHP versions, plain old rand() maxes out at 32,767)
        $ts = strtotime($item['created']);
        $ts = $ts - $min_ts; //bring this down just like we did with $min_rand and $max_rand
        $random_value = $random_value + $ts;
        $weighted_randoms[$key] = $random_value;
    }
    
    //Sort by our weighted random value (the array value), with highest first.
    arsort($weighted_randoms, SORT_NUMERIC);
    
    $randomized_items = array();
    foreach ($weighted_randoms as $item_key => $val) {
        $randomized_items[$item_key] = $items[$item_key];
    }
    
    print_r($randomized_items);
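
Since the code above is admittedly long-winded, here is one possible condensed version of the same idea, offered as a sketch rather than a definitive implementation. The function name `weighted_shuffle_by_recency` and its `$field` parameter are made up for illustration and do not come from the original post. It assumes PHP 7 or later, that every row has a `created` value that `strtotime()` can parse, and that the span between the oldest and newest timestamps fits within what `mt_rand()` can handle.

```php
<?php
// Hypothetical helper: the name and signature are illustrative only.
// Each item gets a score of (seconds since the oldest item) + (random
// number between 0 and the total time span), then items are sorted by
// score, highest first. Newer items therefore tend to land earlier.
function weighted_shuffle_by_recency(array $items, string $field = 'created'): array
{
    if (count($items) < 2) {
        return $items; // nothing to shuffle
    }

    // Parse every timestamp once; array_map() with a single array keeps the keys.
    $timestamps = array_map(function ($item) use ($field) {
        return strtotime($item[$field]); // assumes a parseable date string
    }, $items);

    $min_ts = min($timestamps);
    $range  = max($timestamps) - $min_ts;

    // Weighted random score per item key.
    $scores = array();
    foreach ($timestamps as $key => $ts) {
        $scores[$key] = ($ts - $min_ts) + mt_rand(0, $range);
    }

    // Highest score first, preserving keys.
    arsort($scores, SORT_NUMERIC);

    // Rebuild the item array in score order.
    $shuffled = array();
    foreach ($scores as $key => $score) {
        $shuffled[$key] = $items[$key];
    }

    return $shuffled;
}
```

It would be called as `print_r(weighted_shuffle_by_recency($items));`. As with the longer version, the random component spans the same range as the timestamps, so the newest item is guaranteed to score at least as high as the oldest one, and the bias toward recent items is fairly strong. Scaling the `mt_rand()` upper bound up or down would be one way to loosen or tighten that weighting.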