
Training the same Google AutoML Model multiple times


Question: Is it possible to train the same Google AutoML model multiple times?

Problem: I have several datasets with time series data. Example:

  • Dataset A: [[product1, date1, price], [product1, date2, price]]
  • Dataset B: [[product2, date1, price], [product2, date2, price]]
  • Dataset C: [[product3, date1, price], [product3, date2, price]]

When describing the columns in Google AutoML, you can mark the data as time series data and specify the date column as the time column. It is important to keep in mind that this is time series data. I'd think combining the datasets wouldn't be a good idea because there would be duplicate dates.

Is it possible to train the model on dataset A and, after that finishes, on dataset B, etc., or would you advise combining the datasets?

Thanks.


Solution

  • You can combine the data; I don't see how duplicate dates would matter given what you are describing. Marking a column as the Time column makes AutoML Tables split the data based on that column, putting the oldest 80% of rows in the training set, the next 10% in the validation set, and the most recent 10% in the test set.

    If your dataset does not have enough distinct values in the Time column to be split 80/10/10 as described above, do not mark it as the Time column; split the data manually instead.

    If the datasets are unrelated and distinct from each other, then you would want to train a separate model for each.
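
To see why duplicate dates across products are not a problem for the 80/10/10 split described above, here is a minimal Python sketch. It is only an illustration of a chronological split, not AutoML Tables' actual implementation. The function name `chronological_split`, the sample rows, and the prices are all made up for this example, and how AutoML handles rows that share a date exactly at a split boundary is not covered here.

```python
from datetime import date


def chronological_split(rows, time_key, train_pct=80, validation_pct=10):
    """Split rows oldest-first into train/validation/test sets by position.

    Illustrative only: assumes a plain positional split after sorting by time.
    """
    ordered = sorted(rows, key=time_key)  # oldest rows first
    n = len(ordered)
    # Integer arithmetic avoids floating-point rounding at the boundaries.
    train_end = n * train_pct // 100
    validation_end = n * (train_pct + validation_pct) // 100
    return (
        ordered[:train_end],
        ordered[train_end:validation_end],
        ordered[validation_end:],
    )


# Hypothetical combined data: three products that share the same ten dates.
combined = [
    (product, date(2020, 1, day), 10.0 + day)
    for product in ("product1", "product2", "product3")
    for day in range(1, 11)
]

train_rows, validation_rows, test_rows = chronological_split(
    combined, time_key=lambda row: row[1]
)

print(len(train_rows), len(validation_rows), len(test_rows))
```

Each product ends up in every split, and every training row is older than every validation row, so the shared dates simply sit next to each other in time order. If you split manually instead, the same idea applies: assign each row to a split yourself based on its date.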