I am looking to fit either a multivariate linear regression or a logistic regression in Python on some data that has a large number of categorical variables. I understand that with one categorical variable I would need to convert it into dummy variables and then drop one of the dummies to avoid collinearity. However, is anyone familiar with what the approach should be when dealing with more than one categorical variable?
Do I do the same thing for each, i.e. convert each categorical variable into dummy variables and then drop one dummy from each set to avoid collinearity?
When more than one categorical variable needs to be replaced with dummies, the approach is the same: encode each variable as its own set of dummies (as you would for a single categorical variable), then drop one dummy from each variable's set to avoid collinearity. A variable with k levels therefore ends up with k - 1 dummy columns, and the dropped level acts as the baseline for that variable.
Basically, each categorical variable is treated exactly as it would be if it were the only one.
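As a rough illustration of the above, here is a minimal sketch using pandas. The data frame and its column names (`color`, `size`, `price`) are made up for the example and are not from the question. It assumes the model you fit afterwards includes an intercept, which is the situation where dropping one level per variable is needed.

```python
import pandas as pd

# Hypothetical toy data: two categorical predictors and one numeric one.
df = pd.DataFrame({
    "color": ["red", "green", "blue", "green"],
    "size": ["S", "M", "L", "M"],
    "price": [10.0, 12.5, 9.0, 11.0],
})

# Encode each categorical column on its own. drop_first=True removes one
# level per variable (the first in sorted order), which becomes that
# variable's baseline. dtype=int gives 0/1 columns instead of booleans.
encoded = pd.get_dummies(
    df,
    columns=["color", "size"],
    drop_first=True,
    dtype=int,
)

print(encoded)
```

Here `color` has three levels and `size` has three levels, so each contributes two dummy columns, and `price` is left untouched. The resulting frame can then be passed as the predictor matrix to whichever regression routine you are using.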