I have trained a model using Google AutoML Tables. I would like to inspect the general structure of the trained model (which algorithms were used, what preprocessing was applied, and so on).
In the "Viewing model architecture with Cloud Logging" section of the docs here, I see:
If more than one model was used to create the final model, the hyperparameters for each model are returned as an entry in the modelParameters array, indexed by position (0, 1, 2, and so on).
My modelParameters array is shown below (with the first and last elements expanded). I'm only doing the AutoML Tables quickstart, which uses the open-source Bank Marketing dataset, so I'm surprised that it would return such a complex model (25 stacked/ensembled models?). I would have thought that model 0 alone (a single gradient boosted decision tree with 300 trees and a max depth of 15) would be sufficient.
Also, 25 is a suspiciously round number. Are we sure the docs are correct and this list isn't actually the top 25 candidates, ranked by accuracy score? Is there a better way to understand the end-to-end model (including preprocessing) that Google AutoML Tables is producing?
modelParameters: [
0: {
hyperparameters: {
Center Bias: "False"
Max tree depth: 15
Model type: "GBDT"
Number of trees: 300
Tree L1 regularization: 0
Tree L2 regularization: 0.10000000149011612
Tree complexity: 3
}
}
1: {…}
2: {…}
3: {…}
4: {…}
5: {…}
6: {…}
7: {…}
8: {…}
9: {…}
10: {…}
11: {…}
12: {…}
13: {…}
14: {…}
15: {…}
16: {…}
17: {…}
18: {…}
19: {…}
20: {…}
21: {…}
22: {…}
23: {…}
24: {
hyperparameters: {
Center Bias: "False"
Max tree depth: 9
Model type: "GBDT"
Number of trees: 500
Tree L1 regularization: 0
Tree L2 regularization: 0
Tree complexity: 0.10000000149011612
}
}
]
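In case it's useful, once you have the logged modelParameters as JSON you can summarize the ensemble programmatically rather than expanding entries one by one in the log viewer. A minimal sketch below uses only the two fully expanded entries from the log above; the 23 collapsed entries would come from your own log output:

```python
from collections import Counter

# The two fully expanded entries from the log above. The collapsed
# entries (1 through 23) would be filled in from your own log output.
model_parameters = [
    {"hyperparameters": {
        "Center Bias": "False",
        "Max tree depth": 15,
        "Model type": "GBDT",
        "Number of trees": 300,
        "Tree L1 regularization": 0,
        "Tree L2 regularization": 0.10000000149011612,
        "Tree complexity": 3,
    }},
    {"hyperparameters": {
        "Center Bias": "False",
        "Max tree depth": 9,
        "Model type": "GBDT",
        "Number of trees": 500,
        "Tree L1 regularization": 0,
        "Tree L2 regularization": 0,
        "Tree complexity": 0.10000000149011612,
    }},
]

def summarize(params):
    """Count model types and total trees across the ensemble entries."""
    types = Counter(p["hyperparameters"]["Model type"] for p in params)
    trees = sum(p["hyperparameters"].get("Number of trees", 0) for p in params)
    return {
        "ensemble_size": len(params),
        "model_types": dict(types),
        "total_trees": trees,
    }

print(summarize(model_parameters))
```

With all 25 entries loaded, this would show at a glance how many are GBDTs versus other model types, and how large the combined ensemble is.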
The docs and your original interpretation of the results are correct: in this case, AutoML Tables created an ensemble of 25 models.
The console also provides the full list of individual models tried during the search process if you click the "Trials" link. That should be a much larger list than 25.
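To make the idea of "an ensemble of 25 models" concrete: the final prediction is produced by combining the outputs of the member models, commonly by averaging their predicted probabilities. AutoML Tables does not document its exact combination rule, so the sketch below is illustrative only, with trivial stub functions standing in for trained GBDTs:

```python
def make_stub_model(probability):
    """Stand-in for a trained GBDT: always predicts a fixed probability."""
    return lambda features: probability

# Three stubs standing in for the 25 ensemble members.
ensemble = [make_stub_model(p) for p in (0.2, 0.4, 0.6)]

def ensemble_predict(models, features):
    """Combine member predictions by simple averaging (one common scheme)."""
    return sum(m(features) for m in models) / len(models)

print(ensemble_predict(ensemble, features=None))
```

This is why a single strong model (your model 0) can still be "insufficient" from AutoML's point of view: averaging many diverse models typically reduces variance and improves the held-out score that the search optimizes.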