After training a model with H2O's AutoML tool, I can see the variable importance with saved_model.varimp_plot(). I am curious about the feature engineering that H2O claims to do.
I'm trying a few simple code samples from the H2O documentation:
import h2o
h2o.init()
train_data = h2o.import_file("../full_data.csv")
test_data = h2o.import_file("../201810_pca.csv")
from h2o.automl import H2OAutoML
y = "Label"
x = ['feature0','feature1','feature2','feature3','feature4','feature5','feature6','feature7','feature8','feature9','feature10',
'feature11','feature12','feature13','feature14','feature15','feature16','feature17','feature18','feature19','feature20',
'feature21','feature22','feature23','Amount','DateTime']
aml = H2OAutoML(max_models = 100, max_runtime_secs=100000, seed = 1)
aml.train(x = x, y = y, training_frame = train_data)
lb = aml.leaderboard
lb.head()
lb.head(rows=lb.nrows) # Entire leaderboard
preds = aml.predict(test_data)
h2o.save_model(aml.leader, path = "./Saved_Models")
saved_model = h2o.load_model("./Saved_Models/XGBoost_2_AutoML_20191018_174201")
training_frame = saved_model.actual_params['training_frame']  # This line raises the error
print(training_frame)
How do I see which features are being used in the trained model? I'd like to see if H2O is extracting and adding new features or not.
I tried my_training_frame = your_model.actual_params['training_frame'], as suggested in another question, but it raises an error: "TypeError: 'property' object has no attribute '__getitem__'".
Quick note: H2O.ai has a few products. The open-source platform is called H2O-3, and it includes AutoML. AutoML does not currently do feature engineering for you. If you want automatic feature engineering, you might be thinking of H2O.ai's product Driverless AI.
As for the error you are seeing, this is a bug and you can track the fix here.
Depending on what you pass to the .train() method, you may or may not hit this bug.
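In the meantime, one way to check which columns the saved model relies on is to compare the names in its variable importances with the predictor list x from the question. The sketch below is only illustrative: split_by_origin is a hypothetical helper (not part of H2O), and sample_rows is made-up data shaped like the (variable, relative_importance, scaled_importance, percentage) tuples that model.varimp() returns for models that support variable importance.

```python
def split_by_origin(varimp_rows, input_columns):
    """Partition variable-importance names into known inputs and everything else.

    varimp_rows: iterable of tuples whose first element is the variable name.
    input_columns: the predictor names that were passed to .train() as x.
    """
    known = set(input_columns)
    original, unfamiliar = [], []
    for row in varimp_rows:
        name = row[0]
        if name in known:
            original.append(name)
        else:
            unfamiliar.append(name)
    return original, unfamiliar


# Made-up rows for illustration only; real values come from model.varimp().
sample_rows = [
    ("Amount", 120.5, 1.0, 0.6),
    ("feature3", 60.2, 0.5, 0.3),
    ("feature7", 20.1, 0.17, 0.1),
]
input_columns = ["feature3", "feature7", "Amount", "DateTime"]

original, unfamiliar = split_by_origin(sample_rows, input_columns)
print("Inputs used by the model:", original)
print("Names not in x:", unfamiliar)
```

With the real model you would call split_by_origin(saved_model.varimp(), x). Anything in the second list is a name that was not in x. For tree models with categorical predictors, such a name may simply be an encoded level of an input column rather than an engineered feature, so inspect those names before drawing conclusions.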