I am working with the Titanic dataset and did some basic preprocessing (normalization, one-hot encoding, etc.).
Then I tried to train an H2O model with the following code and got an error:
from h2o.estimators.gbm import H2OGradientBoostingEstimator
classifier = H2OGradientBoostingEstimator(nfolds=5,
                                          ntrees=15,
                                          seed=42,
                                          max_depth=4)
classifier.train(predictors, target, training_frame=train_data)
H2OTypeError: Argument `x` should be a None | integer | string | ModelBase | list(string | integer) | set(integer | string), got H2OFrame
My target is train_data["Survived"].asfactor()
I also tried reading the data into an H2OFrame directly from the file, instead of converting the preprocessed DataFrame into an H2OFrame, but to no avail.
Any ideas would be appreciated.
It seems to me that you are passing an H2OFrame instead of a list of column names.
Both x and y are supposed to be "pointers" to columns in the training_frame, that is, column names or indices rather than the data itself.
If you want to use all columns as predictors (all except the target), you can specify just the y parameter.
Something like the following should do the trick:
train_data["Survived"] = train_data["Survived"].asfactor()
classifier = H2OGradientBoostingEstimator(nfolds=5,
                                          ntrees=15,
                                          seed=42,
                                          max_depth=4)
classifier.train(x=["pclass", "sex", "age", ...], y="Survived", training_frame=train_data)
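
If you would rather not type the predictor names by hand, you can build the list from the frame's column names. Here is a minimal sketch in plain Python; the helper name predictor_names and the sample column list are made up for illustration, and the commented-out lines assume train_data is the H2OFrame from the question.

```python
def predictor_names(columns, target):
    """Return every column name except the target, preserving order."""
    return [name for name in columns if name != target]


# Hypothetical column list standing in for train_data.columns
columns = ["Survived", "pclass", "sex", "age"]
x = predictor_names(columns, "Survived")
print(x)  # ['pclass', 'sex', 'age']

# With the real frame (assumption: train_data is an H2OFrame):
# x = predictor_names(train_data.columns, "Survived")
# classifier.train(x=x, y="Survived", training_frame=train_data)
```

Either way, x ends up as a list of strings, which is one of the types the error message asks for.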