Tags: tensorflow, machine-learning, tensorflow-lite, c-api, image-classification

Resizing the input dimension in TensorFlow Lite C API when using a model created with make_image_classifier


Apologies if this question looks familiar; I posted a broader description of the problem earlier, but I have since deleted it, as I have made some progress in my investigation and can now narrow things down to more specific questions.

Context:

  • I am creating an image classification model using make_image_classifier.
  • I want to use the C API to load the produced model and label images. I am encountering data input issues here.
  • I can label images with the label_image.py example, so the model is fine and the problem is with my use of the C API.
  • If I understand make_image_classifier correctly, it produces a model that expects a 4-dimensional input. We are dealing with images, so beyond width, height, and channels, I don't know what this 4th dimension is. This lack of understanding may be the source of my problem.
  • I included some error handling in my code and the error I am encountering occurs when attempting to copy from the input buffer after a resize.

Questions:

Q1: Why does the model produced by make_image_classifier expect a 4-dimensional input? There are height, width, and channels, but what is the 4th one?

When I do the following with the C API to run the model with my image input:

int inputDims[3] = {224, 224, 3};
tflStatus = TfLiteInterpreterResizeInputTensor(interpreter, 0, inputDims, 3);

I get:

ERROR: tensorflow/lite/kernels/conv.cc:329 input->dims->size != 4 (3 != 4)
ERROR: Node number 2 (CONV_2D) failed to prepare. 

So I end up doing:

int inputDims[4] = {1, 224, 224, 3};
tflStatus = TfLiteInterpreterResizeInputTensor(interpreter, 0, inputDims, 4);

From what I can tell, the first dimension is the batch size, in case I want to process more than one image at once. Is this correct?
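
For reference, the interpreter can report the expected input shape directly, which makes the layout easy to confirm; a minimal sketch, reusing the interpreter from the full code below (after TfLiteInterpreterAllocateTensors has run):

// Query the model's expected input shape instead of assuming it.
TfLiteTensor* input = TfLiteInterpreterGetInputTensor(interpreter, 0);
int numDims = TfLiteTensorNumDims(input);

printf("Input tensor rank: %d, shape: ", numDims);
for(int d = 0; d < numDims; d++){
    printf("%d ", TfLiteTensorDim(input, d));
}
printf("\n"); // Should print "rank: 4, shape: 1 224 224 3" for this model (batch, height, width, channels).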

Q2: Should I be structuring my data input in the same dimension structure used when invoking TfLiteInterpreterResizeInputTensor? I get the error in question with this image RGB input buffer:

// RGB range is 0-255. Scale it to 0-1.
for(int i = 0; i < imageSize; i++){
    imageDataBuffer[i] = (float)pImage[i] / 255.0;
}

I also get an error when building an input that mimics the input dimensions given to TfLiteInterpreterResizeInputTensor, but this seems silly:

float imageData[1][224][224][3];
int j = 0;
for(int h = 0; h < 224; h++){
  for(int w = 0; w < 224; w++){
    imageData[0][h][w][0] = (float)pImage[j] * (1.0 / 255.0);
    imageData[0][h][w][1] = (float)pImage[j+1] * (1.0 / 255.0);
    imageData[0][h][w][2] = (float)pImage[j+2] * (1.0 / 255.0);

    j = j + 3;
  }
}

That last input structure is similar to the one used in the Python label_image.py example when it does this:

input_data = np.expand_dims(img, axis=0)
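
For what it's worth, the two layouts above end up as the same bytes in memory: C stores a float[1][224][224][3] array contiguously in row-major order, which is exactly the order the flat loop writes (interleaved RGB, row by row), and the copy into the tensor is ultimately just a memcpy (see EDIT #1 below). A small illustration of that equivalence; the H/W/C macros and the fill_both helper exist only for this sketch:

#include <assert.h>
#include <string.h>

#define H 224
#define W 224
#define C 3

static float nested[1][H][W][C];
static float flat[H * W * C];

// Fill both layouts from the same interleaved-RGB byte buffer and
// verify they end up bit-identical.
void fill_both(const unsigned char* pImage){
    for(int h = 0; h < H; h++){
        for(int w = 0; w < W; w++){
            for(int c = 0; c < C; c++){
                float v = (float)pImage[(h * W + w) * C + c] / 255.0f;
                nested[0][h][w][c] = v;
                flat[(h * W + w) * C + c] = v;
            }
        }
    }

    assert(sizeof(nested) == sizeof(flat));
    assert(memcmp(nested, flat, sizeof(flat)) == 0);
}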

Q3: What's wrong with my input buffer that makes TfLiteTensorCopyFromBuffer return an error code?

Thank you!

Full Code:

#include "tensorflow/lite/c/c_api.h"
#include "tensorflow/lite/c/c_api_experimental.h"
#include "tensorflow/lite/c/common.h"
#include "tensorflow/lite/c/builtin_op_data.h"
#include "tensorflow/lite/c/ujpeg.h"

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Dispose of the model and interpreter objects.
void disposeTfLiteObjects(TfLiteModel* pModel, TfLiteInterpreter* pInterpreter)
{
    if(pModel != NULL)
    {
      TfLiteModelDelete(pModel);
    }

    if(pInterpreter)
    {
      TfLiteInterpreterDelete(pInterpreter);
    }
}

// The main function.
int main(void) 
{
    TfLiteStatus tflStatus;

    // Create JPEG image object.
    ujImage img = ujCreate();

    // Decode the JPEG file.
    ujDecodeFile(img, "image_224x224.jpeg");

    // Check if decoding was successful.
    if(ujIsValid(img) == 0){
        return 1;
    }
    
    // There will always be 3 channels.
    int channel = 3;

    // Height will always be 224, no need for resizing.
    int height = ujGetHeight(img);

    // Width will always be 224, no need for resizing.
    int width = ujGetWidth(img);

    // The image size is channel * height * width.
    int imageSize = ujGetImageSize(img);

    // Fetch RGB data from the decoded JPEG image input file.
    uint8_t* pImage = (uint8_t*)ujGetImage(img, NULL);

    // The array that will collect the JPEG RGB values.
    float imageDataBuffer[imageSize];

    // RGB range is 0-255. Scale it to 0-1.
    for(int i = 0; i < imageSize; i++){
        imageDataBuffer[i] = (float)pImage[i] / 255.0;
    }

    // Load model.
    TfLiteModel* model = TfLiteModelCreateFromFile("model.tflite");

    // Create the interpreter.
    TfLiteInterpreter* interpreter = TfLiteInterpreterCreate(model, NULL);

    // Allocate tensors.
    tflStatus = TfLiteInterpreterAllocateTensors(interpreter);

    // Log and exit in case of error.
    if(tflStatus != kTfLiteOk)
    {
      printf("Error allocating tensors.\n");
      disposeTfLiteObjects(model, interpreter);
      return 1;
    }
    
    int inputDims[4] = {1, 224, 224, 3};
    tflStatus = TfLiteInterpreterResizeInputTensor(interpreter, 0, inputDims, 4);

    // Log and exit in case of error.
    if(tflStatus != kTfLiteOk)
    {
      printf("Error resizing tensor.\n");
      disposeTfLiteObjects(model, interpreter);
      return 1;
    }

    tflStatus = TfLiteInterpreterAllocateTensors(interpreter);

    // Log and exit in case of error.
    if(tflStatus != kTfLiteOk)
    {
      printf("Error allocating tensors after resize.\n");
      disposeTfLiteObjects(model, interpreter);
      return 1;
    }

    // The input tensor.
    TfLiteTensor* inputTensor = TfLiteInterpreterGetInputTensor(interpreter, 0);

    // Copy the JPEG image data into the input tensor.
    tflStatus = TfLiteTensorCopyFromBuffer(inputTensor, imageDataBuffer, imageSize);
    
    // Log and exit in case of error.
    // FIXME: Error occurs here.
    if(tflStatus != kTfLiteOk)
    {
      printf("Error copying input from buffer.\n");
      disposeTfLiteObjects(model, interpreter);
      return 1;
    }

    // Invoke interpreter.
    tflStatus = TfLiteInterpreterInvoke(interpreter);

    // Log and exit in case of error.
    if(tflStatus != kTfLiteOk)
    {
      printf("Error invoking interpreter.\n");
      disposeTfLiteObjects(model, interpreter);
      return 1;
    }

    // Extract the output tensor data.
    const TfLiteTensor* outputTensor = TfLiteInterpreterGetOutputTensor(interpreter, 0);

    // There are three possible labels. Size the output accordingly.
    float output[3];

    tflStatus = TfLiteTensorCopyToBuffer(outputTensor, output, 3 * sizeof(float));

    // Log and exit in case of error.
    if(tflStatus != kTfLiteOk)
    {
      printf("Error copying output to buffer.\n");
      disposeTfLiteObjects(model, interpreter);
      return 1;
    }

    // Print out classification result.
    printf("Confidences: %f, %f, %f.\n", output[0], output[1], output[2]); 

    // Dispose of the TensorFlow objects.
    disposeTfLiteObjects(model, interpreter);
    
    // Dispose of the image object.
    ujFree(img);
    
    return 0;
}

EDIT #1: Ok, so inside TfLiteTensorCopyFromBuffer:

TfLiteStatus TfLiteTensorCopyFromBuffer(TfLiteTensor* tensor,
                                    const void* input_data,
                                    size_t input_data_size) {
    if (tensor->bytes != input_data_size) {
        return kTfLiteError;
    }

    memcpy(tensor->data.raw, input_data, input_data_size);
    return kTfLiteOk;
}

My input_data_size value is 150,528 (3 channels x 224 pixel height x 224 pixel width), but tensor->bytes is 602,112 (3 channels x 448 pixel height x 448 pixel width, I assume?). I don't understand this discrepancy, especially since I invoked TfLiteInterpreterResizeInputTensor with {1, 224, 224, 3}.

EDIT #2: I believe I have found my answer here. Will resolve this post once confirmed.


Solution

  • The solution I linked to in EDIT #2 was the answer (a short note on why, plus a more defensive variant, follows below). In the end, I just had to replace:

    TfLiteTensorCopyFromBuffer(inputTensor, imageDataBuffer, imageSize);

    with:

    TfLiteTensorCopyFromBuffer(inputTensor, imageDataBuffer, imageSize * sizeof(float));

    Cheers!
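
For anyone who lands here with the same error: TfLiteTensorCopyFromBuffer (quoted in EDIT #1) fails whenever the size argument differs from tensor->bytes, and tensor->bytes is measured in bytes, not elements. The input tensor here is float32, so 224 x 224 x 3 = 150,528 elements works out to 150,528 * sizeof(float) = 602,112 bytes, which is exactly the value I was seeing. A slightly more defensive variant, just a sketch, reusing inputTensor, imageDataBuffer, and imageSize from the full code above:

// Let the tensor report its own size in bytes rather than computing it by hand.
size_t tensorBytes = TfLiteTensorByteSize(inputTensor);  // 602,112 for a 1x224x224x3 float32 tensor.
size_t bufferBytes = imageSize * sizeof(float);          // 150,528 floats * 4 bytes each.

if(tensorBytes != bufferBytes)
{
  printf("Size mismatch: tensor expects %zu bytes, buffer has %zu bytes.\n", tensorBytes, bufferBytes);
  disposeTfLiteObjects(model, interpreter);
  return 1;
}

tflStatus = TfLiteTensorCopyFromBuffer(inputTensor, imageDataBuffer, tensorBytes);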