Today I'm modeling a dataframe using PolynomialFeatures from sklearn, but I keep encountering this error: ValueError: X has 10 features, but PolynomialFeatures is expecting 9 features as input.
It comes from the line where I generate the new data frame X_train_I.
Here's my code:
from sklearn.model_selection import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split(df.drop(["Faltas"],axis=1),df.Faltas,train_size = 0.8)
from sklearn.preprocessing import PolynomialFeatures
poly=PolynomialFeatures(interaction_only=True,include_bias=False).fit(X_train)
X_train_I=pd.DataFrame(poly.transform(df),columns=poly.get_feature_names(X_train.columns))
print(X_train_I.head(5))
Any suggestions would be great! Thanks
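I think you should transform X_train and not df. poly was fitted on X_train, which has 9 columns because Faltas was dropped, whereas df still contains Faltas and so has 10: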
# HERE --v
X_train_I = pd.DataFrame(poly.transform(X_train), columns=poly.get_feature_names(X_train.columns))
Or, if you want to transform the whole dataset rather than only the training split, use poly.transform(df.drop(columns='Faltas')).