I am trying to predict a stock's trend, where 1 means the stock rises and 0 means it falls on a given day. My input features are the closing price, the volume, and the current day's trend, and my output is the next day's trend. When using XGBClassifier() I get the following error:
ValueError Traceback (most recent call last)
<ipython-input-101-d14cdb520e55> in <module>
1 val = np.array(test[0, 0]).reshape(1, -1)
2
----> 3 pred = model.predict(val)
4 print(pred[0])
~/opt/anaconda3/lib/python3.8/site-packages/xgboost/sklearn.py in predict(self, data, output_margin, ntree_limit, validate_features, base_margin)
968 if ntree_limit is None:
969 ntree_limit = getattr(self, "best_ntree_limit", 0)
--> 970 class_probs = self.get_booster().predict(
971 test_dmatrix,
972 output_margin=output_margin,
~/opt/anaconda3/lib/python3.8/site-packages/xgboost/core.py in predict(self, data, output_margin, ntree_limit, pred_leaf, pred_contribs, approx_contribs, pred_interactions, validate_features, training)
1483
1484 if validate_features:
-> 1485 self._validate_features(data)
1486
1487 length = c_bst_ulong()
~/opt/anaconda3/lib/python3.8/site-packages/xgboost/core.py in _validate_features(self, data)
2058 ', '.join(str(s) for s in my_missing))
2059
-> 2060 raise ValueError(msg.format(self.feature_names,
2061 data.feature_names))
2062
ValueError: feature_names mismatch: ['f0', 'f1', 'f2', 'f3'] ['f0']
expected f2, f1, f3 in input data
My code is as follows:
def xgb_predict(train, val):
    train = np.array(train)
    x, y = train[:, :-1], train[:, -1]
    model = XGBClassifier()
    model.fit(x, y)
    val = np.array(val).reshape(1, -1)
    pred = model.predict(val)
    return pred[0]
xgb_predict(train, test[0, 0])
The error is raised by the model.predict(val) call. Thank you very much for the help :)
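You need to select the columns of your test data the same way you do for the training data. The model is fitted on every column except the last, which is four features according to the error message (f0 to f3), but test[0, 0] is a single value, so predict receives only one feature (f0). Your last line should therefore be: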
xgb_predict(train, test[0, :-1])
This selects all columns (features) except the last one, which is the target value.
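To see the mismatch without training anything, here is a minimal sketch that compares the shapes involved. The arrays are random stand-ins for your data (I am assuming four feature columns plus one target column, which is what the error message suggests), and the names wrong and right are just for illustration:

```python
import numpy as np

# Synthetic stand-in for the real data (an assumption for illustration):
# 4 feature columns followed by 1 target column.
rng = np.random.default_rng(0)
train = rng.random((20, 5))
test = rng.random((5, 5))

# What the model is fitted on: every column except the last.
x_train = train[:, :-1]

# test[0, 0] is a single scalar, so reshaping gives one feature only.
wrong = np.array(test[0, 0]).reshape(1, -1)

# test[0, :-1] keeps all feature columns of the first row.
right = np.array(test[0, :-1]).reshape(1, -1)

print(x_train.shape[1], wrong.shape, right.shape)  # 4 (1, 1) (1, 4)
```

The array passed to predict needs as many columns as the array passed to fit, which is the case for right but not for wrong.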