Tags: python, conv-neural-network, mxnet

How to process data for 3d convolutional neural network?


I have a collection of 11*11*21 3D volumes that I want to classify with a 3D convnet. Using Gluon's DataLoader with a batch size of 64, the input tensor to my network has shape (64L, 11L, 11L, 21L). When I tried to run the program I got the following error:

"infer_shape error. Arguments:
data: (64L, 11L, 11L, 21L)"

I then realized that 3D convnets take 5D tensors as input, and I am stuck on how to create a 5D input tensor for the network.

If it helps, here is the code I am currently using to create the data for the convnet:

train_dataset = mx.gluon.data.ArrayDataset(noA_list + A_list, label_noA + label_A)
test_dataset = mx.gluon.data.ArrayDataset(noA_test_list + A_list_test, label_noA_test + label_A_test)
train_data = mx.gluon.data.DataLoader(train_dataset, batch_size=64, shuffle=True, num_workers=cpucount)
test_data = mx.gluon.data.DataLoader(test_dataset, batch_size=64, shuffle=True, num_workers=cpucount)

Solution

  • Yes, you need a 5-dimensional tensor to use Conv3D. By default, the tensor layout is NCDHW, where:

    ‘N’ - batch size, ‘C’ - channel, ‘D’ - depth, ‘H’ - height, ‘W’ - width.

    Convolution is applied on the ‘D’, ‘H’ and ‘W’ dimensions.

    So, if you are missing the channel dimension (and you are working with greyscale, i.e. single-channel, data), you can add it yourself:

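    import mxnet as mx

    # a.shape is (64, 11, 11, 21)
    a = mx.nd.random.uniform(shape=(64, 11, 11, 21))
    # expand_dims returns a new array rather than modifying a in place,
    # so assign the result; axis=1 inserts the 'channel' dimension
    a = a.expand_dims(axis=1)
    # a.shape is now (64, 1, 11, 11, 21)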
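
For the setup in the question, one option is to add the channel axis once, before building the dataset, so that every batch already has five dimensions. The following is a minimal sketch using NumPy only; the helper name `add_channel_axis` and the random `samples` list are hypothetical stand-ins for the question's `noA_list + A_list`, and it assumes each sample is an 11x11x21 array whose axes are already in the depth, height, width order you want.

```python
import numpy as np


def add_channel_axis(samples):
    """Stack 3D samples into one array and insert a channel axis at position 1."""
    # Stack N samples of shape (D, H, W) into a single (N, D, H, W) array.
    batch = np.stack([np.asarray(s, dtype=np.float32) for s in samples])
    # Insert a singleton channel axis to get the NCDHW layout: (N, 1, D, H, W).
    return np.expand_dims(batch, axis=1)


# Hypothetical stand-in for noA_list + A_list: 128 random 11x11x21 volumes.
samples = [np.random.rand(11, 11, 21) for _ in range(128)]
volumes = add_channel_axis(samples)
print(volumes.shape)  # (128, 1, 11, 11, 21)
```

An array prepared this way should be usable as the first argument to `mx.gluon.data.ArrayDataset` in place of the original list, in which case a DataLoader with a batch size of 64 would yield data batches of shape (64, 1, 11, 11, 21).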