I'm currently working on a project in MATLAB using the MNIST data set. My training data set has n = 50000 samples, stored as a 784 x 50000 matrix (50000 column vectors of length 784).
I am trying to split this data into training and testing sets (70-30, respectively), but the method I am using is a bit wordy and brute force for my liking. Since this is MATLAB, I'm sure there has to be a better way. The code I have been using is listed below. I'm brand new to MATLAB, so please help! Thanks :)
% MNIST - data loads into trn and trnAns, representing
% the input vectors and the desired output vectors, respectively
load('Data/mnistTrn.mat');
mnist_train = zeros(784, 35000);
mnist_train_ans = zeros(10, 35000);
mnist_test = zeros(784, 15000);
mnist_test_ans = zeros(10, 15000);
indexes = zeros(1,50000);
for i = 1:50000
    indexes(i) = i;
end
% Shuffle the indexes; the result has to be assigned back to take effect
indexes = indexes(randperm(length(indexes)));
for i = 1:50000
    if i <= 35000
        mnist_train(:,i) = trn(:,indexes(i));
        mnist_train_ans(:,i) = trnAns(:,indexes(i));
    else
        mnist_test(:,i-35000) = trn(:,indexes(i));
        mnist_test_ans(:,i-35000) = trnAns(:,indexes(i));
    end
end
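This should work. Shuffle the columns of both matrices with the same random permutation, then take the first 35000 columns for training and the remaining 15000 for testing, with no loops: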
% MNIST - data loads into trn and trnAns, representing
% the input vectors and the desired output vectors, respectively
load('Data/mnistTrn.mat');
% Generating a random permutation for both trn and trnAns:
perm = randperm(50000);
% Shuffling the columns of both trn and trnAns using a single random permutation:
trn = trn(:, perm);
trnAns = trnAns(:, perm);
mnist_train = trn(:, 1:35000);
mnist_train_ans = trnAns(:, 1:35000);
mnist_test = trn(:, 35001:50000);
mnist_test_ans = trnAns(:, 35001:50000);
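If you would rather not hard-code 35000 and 15000, the same idea can be written in terms of the number of columns. This is only a minimal sketch under a few assumptions: the stand-in data at the top (so it runs without the .mat file), the names numSamples, trainRatio, n, nTrain, trainIdx and testIdx, and the rng(0) seed are all mine for illustration and are not part of the original code.

```matlab
rng(0);                              % optional: fixed seed for a repeatable split

% Stand-in data with the same layout as trn and trnAns (assumption, so the
% sketch runs on its own). With the real data, use load('Data/mnistTrn.mat').
numSamples = 1000;
trn = rand(784, numSamples);
labels = randi(10, 1, numSamples);   % hypothetical class labels 1..10
I = eye(10);
trnAns = I(:, labels);               % one-hot columns, 10 x numSamples

trainRatio = 0.7;                    % 70-30 split
n = size(trn, 2);                    % number of samples (columns)
nTrain = round(trainRatio * n);

% One permutation, split into two disjoint sets of column indices
perm = randperm(n);
trainIdx = perm(1:nTrain);
testIdx = perm(nTrain+1:end);

% Index both matrices with the same indices so inputs and answers stay paired
mnist_train = trn(:, trainIdx);
mnist_train_ans = trnAns(:, trainIdx);
mnist_test = trn(:, testIdx);
mnist_test_ans = trnAns(:, testIdx);
```

Indexing with trainIdx and testIdx leaves trn and trnAns in their original order, which can be handy if you want to draw a different split later.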