Can someone explain how to use Conv3d and ConvND in caffe?


Can someone please explain how to use Conv3D or ConvND for depth images, videos, or pretty much any 3D (N-D?) data in Caffe?
Is there an example or demo for Conv3D?


Solution

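  • You can use the regular "Convolution" layer to process blobs of any dimension. You only need to pay close attention to the parameters: kernel_size, pad and stride are repeated fields, so give one value per spatial dimension, in order: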

    layer {
      type: "Convolution"
      name: "conv_nd"
      bottom: "in" # 5D blob 
      top: "out"
      convolution_param {
         kernel_size: 3
         kernel_size: 5
         kernel_size: 5 # define 3 by 5 by 5 kernel
    
         pad: 1
         pad: 2
         pad: 2  # pad according to kernel size
    
         stride: 1
         stride: 2
         stride: 2 # you can have different stride for different dimensions
    
         axis: 1  # the "channel" dimension
         num_output: 30 # output 30 dim per 3D voxel
      }
    }
    

    For more information, read the comments on ConvolutionParameter in the caffe.proto file.
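
To sanity-check the parameters above before building the net, the output blob shape can be worked out by hand. The sketch below is plain Python, not part of Caffe: conv_output_shape is a hypothetical helper, and the 1 x 3 x 16 x 112 x 112 input shape (batch, channels, depth, height, width) is only an assumed example. It applies the usual per-axis rule, output = (input + 2 * pad - kernel_extent) / stride + 1 with integer division, where kernel_extent = dilation * (kernel - 1) + 1.

```python
def conv_output_shape(input_shape, num_output, kernel, pad, stride,
                      axis=1, dilation=None):
    """Hypothetical helper: shape of the top blob of an N-D convolution.

    input_shape -- full bottom blob shape, e.g. (N, C, D, H, W)
    axis        -- index of the channel dimension; every dimension after
                   it is treated as spatial
    """
    spatial = input_shape[axis + 1:]
    if dilation is None:
        dilation = [1] * len(spatial)  # no dilation by default

    out_spatial = []
    for size, k, p, s, d in zip(spatial, kernel, pad, stride, dilation):
        extent = d * (k - 1) + 1  # effective kernel size along this axis
        out_spatial.append((size + 2 * p - extent) // s + 1)

    # Leading dimensions (batch) are kept, channels become num_output.
    return tuple(input_shape[:axis]) + (num_output,) + tuple(out_spatial)


# Assumed example input: 1 clip, 3 channels, 16 frames of 112 x 112 pixels.
shape = conv_output_shape(
    (1, 3, 16, 112, 112),
    num_output=30,
    kernel=[3, 5, 5],
    pad=[1, 2, 2],
    stride=[1, 2, 2],
)
print(shape)  # (1, 30, 16, 56, 56)
```

With these settings the depth (frame) axis keeps its length, because a kernel of 3 with pad 1 and stride 1 preserves size, while height and width are halved by the stride of 2.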