machine-learning google-prediction

Google Prediction max CSV size?


I uploaded a 1.8GB CSV file to Google Cloud Storage.

When I start the training from the Google API Explorer I get an error:

{
 "error": {
  "errors": [
   {
    "domain": "global",
    "reason": "invalid",
    "message": "Data size limit exceeded."
   }
  ],
  "code": 400,
  "message": "Data size limit exceeded."
 }
}

I'm confused. The FAQ says:

What training data does the Prediction API support?
Training data can be provided in one of three ways:
A CSV formatted training data file up to 2.5GB in size, loaded into Google Storage.

And from the pricing page:

Training:
$0.002/MB bulk trained (maximum size of each dataset: 250MB)


What is the difference between the 250MB limit and the 2.5GB limit?


Solution

  • It was a small bug in the Google Prediction API.

    I posted the question on the Google Group, and the team fixed the bug quickly: https://groups.google.com/forum/#!msg/prediction-api-discuss/Ap0WbdTco2g/kHoEMbJPteYJ
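
For anyone comparing a file against the two limits quoted in the question, here is a minimal sketch of a local size check. It is not part of the Prediction API. The function names are hypothetical, and treating 1MB as 1024**2 bytes is an assumption, since neither quoted page says which unit convention it uses.

```python
import os

# Limits as quoted in the question. Binary units are an assumption.
FAQ_LIMIT_BYTES = int(2.5 * 1024**3)   # "up to 2.5GB" (FAQ)
PRICING_LIMIT_BYTES = 250 * 1024**2    # "250MB" (pricing page)


def check_training_size(size_bytes):
    """Report which of the two quoted limits a given size fits under."""
    return {
        "within_faq_limit": size_bytes <= FAQ_LIMIT_BYTES,
        "within_pricing_limit": size_bytes <= PRICING_LIMIT_BYTES,
    }


def check_training_file(path):
    """Hypothetical helper: run the same check on a local CSV file."""
    return check_training_size(os.path.getsize(path))


# The 1.8GB file from the question, expressed in bytes.
result = check_training_size(int(1.8 * 1024**3))
print(result)
```

Under these assumptions the 1.8GB file fits within the 2.5GB FAQ limit but not the 250MB figure from the pricing page, which is the mismatch the question asks about.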