Search code examples
arraysmatlabshufflepsychtoolbox

Shuffle an array of unequally repeating entries, so they do not repeat


I am writing an experiment in Matlab Psychtoolbox and my conditions are stored in an array like this

Cond = ["VM" "VM" "VN" "VS" "AM" "AM" "AN" "AS" "CM" "CM" "CN" "CS"...
"VM" "VM" "VN" "VS" "AM" "AM" "AN" "AS" "CM" "CM" "CN" "CS"];

I now want to shuffle the array in a way that I don't have repeating conditions.

There are a lot of treats regarding this problem e.g.this one, but they always have every condition equally often.

Some suggested a brute force method, shuffling so often until this criteria is reached. But since I have tree of this condition arrays and I have to shuffle them several times per experimental run I don't think that is a good solution.

Hope someone can help


Solution

  • I devised an algorithm that should do what you're asking for. Given a sequence, it will randomly reorder it such that no value repeats. However, it does appear to have a tendency to create repeated sub-sequences (e.g. ..."A" "B" "C" "A" "B" "C"...). I wrapped it in a function reorder_no_reps:

    function seq = reorder_no_reps(seq)
    
      % Find unique values and counts:
      N = numel(seq);
      [vals, ~, index] = unique(seq(:), 'stable');
      counts = accumarray(index, 1);
      [maxCount, maxIndex] = max(counts);
    
      % Check the maximum number of occurrences:
      if (2*maxCount-1 > N)
        error('Can''t produce sequence without repeats!');
      end
    
      % Fill cell array column-wise with permuted and replicated set of values:
      C = cell(maxCount, ceil(N/maxCount));
      if (3*maxCount-1 > N)
        permIndex = [maxIndex(1) ...
                     setdiff(randperm(numel(vals)), maxIndex(1), 'stable')];
      else
        permIndex = randperm(numel(vals));
      end
      C(1:N) = num2cell(repelem(vals(permIndex), counts(permIndex)));
    
      % Transpose cell array and extract non-empty entries:
      C = C.';
      seq = reshape([C{~cellfun('isempty', C)}], size(seq));
    
    end
    

    A description of the steps for the algorithm:

    • Find the unique values in the input sequence (vals) and how many times they each appear (counts).
    • Find the maximum occurrence of a single value (maxCounts) and check if it is too large to make a sequence without repreats.
    • A random permutation order is applied to both the unique values and their counts. If the maximum occurrence exceeds a given threshold, the corresponding value index is moved to the beginning of the randomized order (the reason why is addressed in the next bullet).
    • Each unique value is then replicated in place by its number of occurrences in the sequence. A cell array is filled in column-wise order with these replicated values, possibly leaving some cells at the end empty. Since the cell array has a number of rows equal to the maximum occurrence of a value, no single row of the resulting cell array will have a value appearing more than once in it. In addition, the last value in each row should be different than the first value in the subsequent row (ensured by filling first with the most frequently occurring value in certain cases).
    • Transpose the cell array, then pull all the non-empty cell values in column-wise order and reshape them into the same size as the input sequence. This is the same as pulling values row-wise from the non-transposed cell array, which should give you a sequence without repeated values.