Search code examples
ludwig

How does Ludwig encode images


I’m looking to understand how Ludwig encodes images. Does it run the images through a pretrained model without running a loss or does it run a loss? If so, what type of loss is ran for a large feature set?


Solution

  • All the documentation regarding image pre-processing and encoding can be found here.

    In summary:

    Currently there are two encoders supported for images: Convolutional Stack Encoder and ResNet encoder which can be set by setting encoder parameter to stacked_cnn or resnet in the input feature dictionary in the model definition (stacked_cnn is the default one).

    Each of the above encoders have configurable hyper-parameters. It does not run them through a pre-trained model as that would defeat the purpose - you are training your own model with your own data.