
How to see resampled data after BOOTSTRAP


I was trying to resample (with replacement) my dataset using 'bootstrp' in Matlab as follows:

D = load('Data.txt');
lead = D(:,1);
depth = D(:,2);
X = D(:,3);
Y = D(:,4);

%Bootstrapping to resample 100 times
[resampling100,bootsam] = bootstrp(100,'corr',lead,depth);

%plotting the bootstrapping result as a histogram
hist(resampling100,10);
... ... ...
... ... ...

Though the script above works, I wonder how I can see/load the 100 resampled datasets created by bootstrp. 'bootsam(:)' displays the indices of the values selected for the bootstrap samples, but not the new sample values! Isn't it funny that I'm creating fake data from my original data and can't even see what is created behind the scenes?

My second question: is it possible to resample the whole matrix (in this case, D) at once without using any function? I do know how to draw random values from a data vector using 'unidrnd'.

Thanks in advance for your help.


Solution

  • The answer to question 1 is that bootsam provides the indices of the resampled data. Specifically, the nth column of bootsam provides the indices of the nth resampled dataset. In your case, to obtain the nth resampled dataset you would use:

    lead_resample_n = lead(bootsam(:, n));
    depth_resample_n = depth(bootsam(:, n));
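
    If you want all of the resampled datasets at once rather than one column at a time, indexing with the whole bootsam matrix should work. The following is a minimal sketch, not a definitive recipe: it assumes the Statistics Toolbox is available (for bootstrp and corr), and the data, N, nboot, leadAll, depthAll and r_n are made-up names and values for illustration.

    ```matlab
    %# Synthetic stand-ins for the lead and depth columns (assumed data)
    rng(1);
    N = 50;
    lead = randn(N, 1);
    depth = 0.5 * lead + randn(N, 1);

    %# Same kind of call as in the question
    nboot = 100;
    [bootstat, bootsam] = bootstrp(nboot, @corr, lead, depth);

    %# Index with the whole index matrix: column n is the nth resampled dataset
    leadAll = lead(bootsam);      %# N-by-nboot
    depthAll = depth(bootsam);    %# N-by-nboot

    %# Recompute the statistic for one resampled dataset by hand
    n = 7;
    r_n = corr(leadAll(:, n), depthAll(:, n));
    ```

    Here r_n should agree with bootstat(n), which is a quick way to convince yourself that the columns of leadAll and depthAll really are the data bootstrp used behind the scenes.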
    

    Regarding the second question, I'm guessing you mean: how do you get a resampled dataset without applying a function to the resampled data? Personally, I would use randi, but in this situation it makes no difference whether you use randi or unidrnd; both draw integers uniformly from 1 to T. An example follows that assumes a data matrix D with 4 columns (as in your question):

    %# Build an example dataset
    T = 10;
    D = randn(T, 4);
    
    %# Obtain a set of random indices, ie indices of draws with replacement
    Ind = randi(T, T, 1);
    
    %# Obtain the resampled data
    DResampled = D(Ind, :);
    

    To create multiple resampled datasets, you can simply loop over the creation of random indices. Or you could do it in one step by creating a matrix of random indices and using that to index D. With careful use of reshape and permute you can turn the result into a T-by-4-by-M array, where indexing m = 1, ..., M along the third dimension yields the mth resampled dataset. Example code follows:

    %# Build an example dataset
    T = 10;
    M = 3;
    D = randn(T, 4);
    
    %# Obtain a set of random indices, ie indices of draws with replacement
    Ind = randi(T, T, M);
    
    %# Obtain the resampled data
    DResampled = permute(reshape(D(Ind, :)', 4, T, []), [2 1 3]);
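
    If the reshape/permute line looks opaque, the loop version mentioned above can serve as a cross-check. This is a sketch using only base MATLAB; DLoop and sameResult are names introduced here for illustration, and the seed is arbitrary.

    ```matlab
    %# Build an example dataset and a T-by-M matrix of random indices
    rng(0);
    T = 10;
    M = 3;
    D = randn(T, 4);
    Ind = randi(T, T, M);

    %# One-step version
    DResampled = permute(reshape(D(Ind, :)', 4, T, []), [2 1 3]);

    %# Loop version: page m holds the rows of D picked by column m of Ind
    DLoop = zeros(T, size(D, 2), M);
    for m = 1:M
        DLoop(:, :, m) = D(Ind(:, m), :);
    end

    %# Both approaches should give the same T-by-4-by-M array
    sameResult = isequal(DResampled, DLoop);
    ```

    The loop is easier to read and is probably fast enough for moderate M; the one-step version mainly saves the explicit loop.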