In the documentation of TransformedTargetRegressor, it is mentioned that the parameter func needs to return a 2-dimensional array. Should it not be a 1-dimensional array instead? The target y mostly has the shape (n_samples,), which is 1-dimensional.
The code below, where the target y and the output of func are both 1-dimensional, runs properly:
import numpy as np
from sklearn import compose, linear_model

# X_train and yCO_train are the training data, defined elsewhere
exponentiate = lambda x: np.exp(x)
naturalLog = lambda x: np.log(x)
loglinreg = compose.TransformedTargetRegressor(regressor=linear_model.LinearRegression(), func=naturalLog, inverse_func=exponentiate)
loglinreg.fit(X_train, yCO_train)
loglinreg.score(X_train, yCO_train)
In the source, func is applied using a FunctionTransformer, which requires 2-dimensional input. This also aligns with the other option, setting a transformer object directly, since transformers generally expect 2-dimensional input.
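To see that requirement in isolation, here is a minimal sketch using FunctionTransformer on its own. It assumes the internal transformer is built with validate=True (that is my reading of the source, so treat it as an assumption), and the small arrays are made up for illustration:

```python
import numpy as np
from sklearn.preprocessing import FunctionTransformer

# validate=True is an assumption, chosen to mirror the internal transformer.
transformer = FunctionTransformer(func=np.log, inverse_func=np.exp, validate=True)

y_1d = np.array([1.0, 2.0, 3.0])  # shape (3,), like a typical target
y_2d = y_1d.reshape(-1, 1)        # shape (3, 1), one column

# A 2-dimensional input passes validation.
out_2d = transformer.fit_transform(y_2d)
print(out_2d.shape)

# A 1-dimensional input is rejected by input validation.
try:
    transformer.fit_transform(y_1d)
    rejected = False
except ValueError as err:
    rejected = True
    print(err)
```

With validation switched on, the same function is accepted for the column-shaped input and refused for the flat one, which is why the target has to be reshaped before func sees it.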
See also the Note in the docs:
Internally, the target y is always converted into a 2-dimensional array to be used by scikit-learn transformers. At the time of prediction, the output will be reshaped to have the same number of dimensions as y.
In your example, it runs because np.log and np.exp are shape-agnostic; during fitting, those two functions are actually being called on 2-dimensional arrays. You can check this by defining your own func:
def mylog(y):
    return np.log(y).ravel()
Using that, we get the expected "Expected 2D array, got 1D array instead" error.
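A complementary way to check this, without provoking an error, is to record what func actually receives. The sketch below is only an illustration: the synthetic data, the recording_log wrapper and the seen_ndims list are all made up for this example and are not part of scikit-learn.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import LinearRegression

# Synthetic, strictly positive data (made up for illustration).
rng = np.random.default_rng(0)
X = rng.uniform(1.0, 2.0, size=(50, 3))
y = np.exp(X @ np.array([0.5, 1.0, 1.5]))  # 1-dimensional target, shape (50,)

seen_ndims = []  # hypothetical helper state: ndim of every array passed to func

def recording_log(y_in):
    # Record the dimensionality of the array handed to func, then transform it.
    seen_ndims.append(np.ndim(y_in))
    return np.log(y_in)

model = TransformedTargetRegressor(
    regressor=LinearRegression(),
    func=recording_log,
    inverse_func=np.exp,
)
model.fit(X, y)
pred = model.predict(X)

print(y.ndim)           # dimensionality of the target passed to fit
print(set(seen_ndims))  # dimensionalities that func actually received
print(pred.ndim)        # dimensionality of the predictions
```

Even though y is passed in as a 1-dimensional array, every call to func should see a 2-dimensional one, and the predictions come back with the same number of dimensions as the original y, matching the Note quoted above.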