I searched the documentation but still can't tell whether the service shuffles data before training/evaluation. I need to know because my data is a time series, so it would be unrealistic to evaluate a trained model on samples from an earlier period of time.
Can someone please tell me the answer or point me to how I can figure this out? I know that I can export the evaluation results and work with them, but BigQuery does not seem to preserve the order of the original data, and there is no absolute time feature in the data.
It doesn't shuffle the data, but it does split it.
Take a look at the "About controlling data split" section of the documentation. It says:
By default, AutoML Tables randomly selects 80% of your data rows for training, 10% for validation, and 10% for testing.
If your data is time-sensitive, you should use the Time column.
By using it, AutoML Tables will use the earliest 80% of the rows for training, the next 10% of rows for validation, and the latest 10% of rows for testing.
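To make the quoted behavior concrete, here is a minimal sketch in plain Python of what an earliest-80% / next-10% / latest-10% split looks like. It only illustrates the idea and is not AutoML Tables' actual implementation; the function name `chronological_split`, the `time_of` parameter, and the `"t"` field are all hypothetical names chosen for this example.

```python
from typing import Any, Callable, List, Sequence, Tuple


def chronological_split(
    rows: Sequence[Any],
    time_of: Callable[[Any], Any],
    train_frac: float = 0.8,
    valid_frac: float = 0.1,
) -> Tuple[List[Any], List[Any], List[Any]]:
    """Split rows into (train, validation, test) by time, oldest first.

    Illustration only: the fractions mirror the 80/10/10 split described
    in the quoted documentation, but this is not the service's own code.
    """
    # Order the rows by their time value so that "earliest" is well defined.
    ordered = sorted(rows, key=time_of)
    n = len(ordered)

    # Boundaries: the first train_frac of rows, then the next valid_frac.
    train_end = int(n * train_frac)
    valid_end = train_end + int(n * valid_frac)

    # Whatever remains after the validation slice is the latest data (test).
    return ordered[:train_end], ordered[train_end:valid_end], ordered[valid_end:]


# Hypothetical example: ten rows whose "t" field stands in for a Time column.
rows = [{"t": t} for t in [7, 2, 9, 0, 4, 1, 8, 3, 6, 5]]
train, valid, test = chronological_split(rows, time_of=lambda r: r["t"])
```

With a split like this, every test row is later than every training row, which is the property you want when evaluating a time-series model.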