
Rearrange array into a form suitable for NN training


I am working with a large dataset that I need to convert into a specific format for further processing, and I am looking for advice on how to do this.

Sample input:

A = [0.99  -0.99
     1     -1
     0.55  -0.55]

Sample output:

val(:,:,1,1)=0.99
val(:,:,2,1)=-0.99
val(:,:,1,2)=1
val(:,:,2,2)=-1
val(:,:,1,3)=0.55
val(:,:,2,3)=-0.55

While working on this, I found the following function in the CNN toolbox of MATLAB R2018b:

function dummifiedOut = dummify(categoricalIn)
    % iDummify   Convert a categorical input into a dummified output.
    %
    % dummifiedOut(1,1,i,j)=1 if observation j is in class i, and zero
    % otherwise. Therefore, dummifiedOut will be of size [1, 1, K, N],
    % where K is the number of categories and N is the number of
    % observations in categoricalIn.

    %   Copyright 2015-2016 The MathWorks, Inc.

    numObservations = numel(categoricalIn);
    numCategories = numel(categories(categoricalIn));
    dummifiedSize = [1, 1, numCategories, numObservations];
    dummifiedOut = zeros(dummifiedSize);
    categoricalIn = iMakeHorizontal( categoricalIn );
    idx = sub2ind(dummifiedSize(3:4), int32(categoricalIn), 1:numObservations);
    dummifiedOut(idx) = 1;
end
function vec = iMakeHorizontal( vec )
    vec = reshape( vec, 1, numel( vec ) );
end

Can this block of code be modified to produce the sample output?


Solution

  • Either do what rinkert suggested, or just use permute directly:

    >> val = permute(A, [4,3,2,1])
    val(:,:,1,1) =
        0.9900
    val(:,:,2,1) =
       -0.9900
    val(:,:,1,2) =
         1
    val(:,:,2,2) =
        -1
    val(:,:,1,3) =
        0.5500
    val(:,:,2,3) =
       -0.5500 
    

    Note that the function which you posted requires categorical data, whereas you have a simple double array. If you insist on "adapting" the existing dummify, you could do:

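    function dummifiedOut = dummify(categoricalIn)
        % Transpose first: MATLAB fills arrays in column-major order, so
        % without it the result would be 1x1x3x2 instead of 1x1x2x3.
        transposedIn = categoricalIn.';
        dummifiedOut = zeros([1,1,size(transposedIn)]);
        dummifiedOut(:) = transposedIn;
    end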
    

    (...although, IMHO, this makes little sense.)
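
As a sanity check on the permute approach, here is a minimal sketch, assuming A is the 3-by-2 sample matrix from the question. The variable names valReshaped and recovered are illustrative only and do not come from the original post. It checks the size of the permuted array, shows a transpose-then-reshape formulation that should give the same result, and undoes the rearrangement with a second permute.

```matlab
% Sample matrix from the question: 3 observations, 2 values each.
A = [0.99 -0.99
     1    -1
     0.55 -0.55];

% Move rows to dim 4 and columns to dim 3; dims 1 and 2 become singletons.
val = permute(A, [4, 3, 2, 1]);
sz = size(val);          % [1 1 2 3]

% Alternative: transpose, then reshape. reshape fills in column-major
% order, so the transpose is what puts each row's values along dim 3.
valReshaped = reshape(A.', [1, 1, size(A, 2), size(A, 1)]);

% Undo the rearrangement to get the original 3-by-2 matrix back.
recovered = permute(val, [4, 3, 1, 2]);
```

Both formulations only rearrange the data, so the choice between them is mostly one of readability: permute states the dimension mapping directly, while reshape makes the target size explicit.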