tensorflow, object-detection, google-cloud-automl, automl

Google Cloud AutoML Object Detection Training


I'm trying to train an object detection model with Google Cloud's AutoML service. After loading my dataset, I get the message:

You have enough images to start training

So I set up my training on the dataset and a few minutes later I get an error message:

Error: Failed to train model.

When I click on details, I get:

Error details

Operation ID:
    projects/<redacted>/locations/us-central1/operations/<redacted>
Error Messages:
    INTERNAL

How do I make it run successfully?


Solution

  • The error in this case was caused by duplicate images in the import. I uploaded my detections in CSV format, which accepts a maximum CSV size of 100 MB. Since my CSV was too large, I broke it into four chunks using the split command. Many of my images had multiple detections, so a chunk boundary could fall in the middle of one image's rows. It turned out that the last detection in one CSV referred to the same image as the first detection in the next CSV, so that image was imported twice.

    I manually edited my CSVs so that no Google Cloud Storage URL appeared in more than one CSV, re-imported, trained, and it worked.
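
The manual fix above can also be scripted. Below is a minimal Python sketch, not the method used in the answer, that splits rows into size-limited chunks without ever separating rows that share an image. It assumes each row contains exactly one field starting with `gs://` that identifies the image. The helper names (`image_uri`, `row_size`, `split_by_image`, `write_chunks`) and the output file naming are hypothetical, and the size limit is passed in by the caller.

```python
import csv
import io
from collections import OrderedDict


def image_uri(row):
    """Return the first field that looks like a Cloud Storage URI.

    Assumption: every row has one field starting with "gs://".
    """
    for field in row:
        if field.strip().startswith("gs://"):
            return field.strip()
    raise ValueError(f"no gs:// URI in row: {row!r}")


def row_size(row):
    """Size in bytes of the row once serialized as a CSV line."""
    buf = io.StringIO()
    csv.writer(buf).writerow(row)
    return len(buf.getvalue().encode("utf-8"))


def split_by_image(rows, max_bytes):
    """Split rows into chunks, keeping all rows for one image together.

    A chunk is closed before it would exceed max_bytes. A single image
    whose rows alone exceed max_bytes still ends up in one (oversized)
    chunk, since splitting it would reintroduce the duplicate.
    """
    # Group rows by image, preserving first-seen order of the images.
    groups = OrderedDict()
    for row in rows:
        groups.setdefault(image_uri(row), []).append(row)

    chunks, current, current_size = [], [], 0
    for group in groups.values():
        group_size = sum(row_size(r) for r in group)
        if current and current_size + group_size > max_bytes:
            chunks.append(current)
            current, current_size = [], 0
        current.extend(group)
        current_size += group_size
    if current:
        chunks.append(current)
    return chunks


def write_chunks(src_path, out_prefix, max_bytes):
    """Read src_path and write out_prefix_0.csv, out_prefix_1.csv, ..."""
    with open(src_path, newline="", encoding="utf-8") as f:
        rows = [row for row in csv.reader(f) if row]
    paths = []
    for i, chunk in enumerate(split_by_image(rows, max_bytes)):
        path = f"{out_prefix}_{i}.csv"
        with open(path, "w", newline="", encoding="utf-8") as f:
            csv.writer(f).writerows(chunk)
        paths.append(path)
    return paths
```

For example, `write_chunks("detections.csv", "detections_part", 90 * 1024 * 1024)` would write chunks that stay a little under the 100 MB limit mentioned above, where `detections.csv` is a placeholder file name. Grouping by image also pulls together rows for the same image that were not adjacent in the original file, so the row order in the output can differ from the input.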