Randomly select samples from a cell array in MATLAB


I have a cell array in MATLAB as follows; the first column holds the user IDs:

A = { 'U2', 'T13', 'A52';  
      'U2', 'T15', 'A52';  
      'U2', 'T18', 'A52';  
      'U2', 'T17', 'A995'; 
      'U4', 'T18', 'A53';  
      'U4', 'T13', 'A64';  
      'U4', 'T18', 'A64';
      ....
     }

I also have a cell array B that contains the unique user IDs:

B = {'U2', 'U4'}

My goal is to randomly select two samples for each user. Assume each user has at least two samples in A.

One possible result, C, is as follows:

C = { 'U2', 'T13', 'A52';  
      'U2', 'T18', 'A52';   
      'U4', 'T13', 'A64';  
      'U4', 'T18', 'A64';
        ...
     }

How can I generate such a sample?


Solution

  • The following code should produce what you are looking for:

    A = {
      'U2', 'T13', 'A52';  
      'U2', 'T15', 'A52';  
      'U2', 'T18', 'A52';  
      'U2', 'T17', 'A995'; 
      'U4', 'T18', 'A53';  
      'U4', 'T13', 'A64';  
      'U4', 'T18', 'A64';
      'U7', 'T14', 'A44';  
      'U7', 'T14', 'A27';  
      'U7', 'T18', 'A27';  
      'U7', 'T13', 'A341';  
      'U7', 'T11', 'A111';
      'U8', 'T17', 'A39';  
      'U8', 'T15', 'A58'
    };
    
    % Find the unique user identifiers...
    B = unique(A(:,1));
    B_len = numel(B);
    
    % Preallocate a cell array to store the results...
    R = cell(B_len*2,size(A,2));
    R_off = 1;
    
    % Iterate over the unique user identifiers...
    for i = 1:B_len
    
        % Pick all the entries of A belonging to the current user identifier...
        D = A(ismember(A(:,1),B(i)),:);
    
        % Pick two random non-repeating entries and add them to the results...
        idx = datasample(1:size(D,1),2,'Replace',false);
        R([R_off (R_off+1)],:) = D(idx,:); 
    
        % Properly increase the offset to the results array...
        R_off = R_off + 2;
    
    end
    

    Here is one of the possible outcomes for the code snippet above:

    >> disp(R)
    
        'U2'    'T13'    'A52' 
        'U2'    'T18'    'A52' 
        'U4'    'T13'    'A64' 
        'U4'    'T18'    'A64' 
        'U7'    'T14'    'A44' 
        'U7'    'T13'    'A341'
        'U8'    'T17'    'A39' 
        'U8'    'T15'    'A58' 
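
    Note that datasample belongs to the Statistics and Machine Learning Toolbox. If that toolbox is not available, a variant along the following lines, which uses only randperm and unique from base MATLAB, may work instead. Treat it as a minimal sketch rather than a drop-in replacement: it uses a shortened copy of A for illustration, the variable names grp, rows and pick are my own, and like the loop above it assumes every user has at least two rows (randperm raises an error otherwise).

    ```matlab
    % Shortened sample data, illustrative only.
    A = {
      'U2', 'T13', 'A52';
      'U2', 'T15', 'A52';
      'U2', 'T18', 'A52';
      'U4', 'T18', 'A53';
      'U4', 'T13', 'A64';
      'U4', 'T18', 'A64'
    };

    % Unique user identifiers; grp(k) is the index into B of the user in row k.
    [B, ~, grp] = unique(A(:,1));

    % Preallocate two result rows per user.
    R = cell(2*numel(B), size(A,2));

    for i = 1:numel(B)
        % Row numbers of A that belong to the current user.
        rows = find(grp == i);

        % Two distinct rows in random order (sampling without replacement).
        pick = rows(randperm(numel(rows), 2));

        % Store them in the two slots reserved for this user.
        R(2*i-1:2*i, :) = A(pick, :);
    end

    disp(R)
    ```

    Using the third output of unique avoids calling ismember on every iteration, and computing the destination rows from i removes the need for a separate offset counter.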
    

    For more information about the functions I used, refer to the following pages of the official MATLAB documentation: