I have a dataset where I am trying to predict the type of car based on a number of features. I would like to run an OLS regression for this:
import numpy as np
import statsmodels.api as sm

X = features  # feature matrix, one row per car
# class labels, where 0 = sedan, 1 = minivan, etc.
y = [0,0,1,0,2,....]
X2 = sm.add_constant(np.array(X))  # add an intercept column
est = sm.OLS(np.array(y), X2)
est2 = est.fit()
I don't think this is correct, because I am not specifying that the outcome is categorical, and I feel the functional form should change. Does anyone have any insight on this?
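Ordinary least squares regression assumes a numerical dependent variable, so you cannot use it to predict categorical outcomes. Coding the classes as 0, 1, 2 makes the model treat them as ordered and equally spaced, which car types are not.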
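To see why the integer codes are a problem, here is a minimal sketch using made-up data (one feature, nine cars; the third class, coded 2, is hypothetical). It fits the same least-squares line under two equally valid labelings and shows that the fitted model depends on the arbitrary choice of codes. It uses `np.linalg.lstsq` rather than statsmodels only to keep the example dependency-free.

```python
import numpy as np

# Toy data (made up for illustration): one feature, three cars per class.
x = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2], dtype=float)
A = np.column_stack([np.ones_like(x), x])  # intercept column + feature

# Coding A: class codes happen to follow the feature (0, 1, 2).
y_a = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2], dtype=float)
# Coding B: same cars, but the codes for the 2nd and 3rd class are swapped.
y_b = np.array([0, 0, 0, 2, 2, 2, 1, 1, 1], dtype=float)

# Ordinary least squares fit: returns [intercept, slope].
coef_a, *_ = np.linalg.lstsq(A, y_a, rcond=None)
coef_b, *_ = np.linalg.lstsq(A, y_b, rcond=None)

# Fitted values for one car from each class (x = 0, 1, 2).
pred_b = A[[0, 3, 6]] @ coef_b

print("coding A:", coef_a)
print("coding B:", coef_b)
print("coding B fitted values:", pred_b)
```

Under coding B, the car at x = 1 carries code 2 but gets a fitted value of 1.0, and the other fitted values are fractional numbers that do not correspond to any class. Nothing about the cars changed, only the labels.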
To predict categorical outcomes with a regression model, use multinomial logistic regression instead, for example scikit-learn's LogisticRegression.
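A minimal sketch of that approach, assuming scikit-learn is installed. The data here is synthetic and stands in for your `features` and `y`; the cluster centers, sample sizes, and the third class are all made up for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data (hypothetical): 3 car types, 2 numeric features.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
y = np.repeat([0, 1, 2], 50)  # 0 = sedan, 1 = minivan, 2 = a third type
X = centers[y] + rng.normal(scale=0.5, size=(150, 2))

# With more than two classes, the default lbfgs solver fits a
# multinomial (softmax) model, so the labels are treated as unordered.
clf = LogisticRegression(max_iter=1000)
clf.fit(X, y)

probs = clf.predict_proba(X[:5])  # one probability per class, rows sum to 1
preds = clf.predict(X[:5])        # most probable class for each row
accuracy = clf.score(X, y)        # training accuracy

print(probs.round(3))
print(preds, accuracy)
```

If you would rather stay in statsmodels and get a coefficient summary table, `sm.MNLogit` fits the same kind of model.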