pandas python-2.7 machine-learning scikit-learn one-hot-encoding

How do I apply one hot encoding on a pandas dataframe with both categorical and numerical features?

Some features are numerical, such as "graduation rate from school", while others are categorical, like the name of the school. I used a label encoder on the categorical features to transform them into integers.

I now have a dataframe with both floats and integers, representing numerical features and categorical features (transformed with the label encoder) respectively.

I am unsure how to proceed with a learner. Do I need to use one hot encoding, and if so, how? As far as I understand, I cannot simply pass the dataframe to the sklearn OneHotEncoder, since it contains floats. Do I just apply the label encoder to all features to solve the issue?

Sample data from my dataframe. OPEID and opeid6 were transformed using a label encoder.

Solution

Just use the OneHotEncoder categorical_features argument to select which features are categorical:

categorical_features: "all" or array of indices or mask

Specify what features are treated as categorical.

‘all’ (default): All features are treated as categorical.

array of indices: Array of categorical feature indices.

mask: Array of length n_features and with dtype=bool.

Non-categorical features are always stacked to the right of the matrix.
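
Note that the categorical_features argument was deprecated in scikit-learn 0.20 and removed in 0.22, so the approach above only applies to older releases. On newer versions, something along these lines with ColumnTransformer should give an equivalent result. This is only a sketch: the data values and the grad_rate column name are made up for illustration, and only OPEID and opeid6 come from the question.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical data: values and the "grad_rate" column name are invented.
df = pd.DataFrame({
    "OPEID": [0, 1, 2, 1],                   # label-encoded categorical
    "opeid6": [0, 1, 1, 0],                  # label-encoded categorical
    "grad_rate": [0.61, 0.48, 0.75, 0.52],   # numerical, left untouched
})

categorical_cols = ["OPEID", "opeid6"]

# One-hot encode only the listed columns; pass every other column through.
# sparse_threshold=0 forces a dense array so the result is easy to inspect.
transformer = ColumnTransformer(
    transformers=[
        ("onehot", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ],
    remainder="passthrough",
    sparse_threshold=0,
)

X = transformer.fit_transform(df)
print(X.shape)
print(X)
```

As with the old argument, the untouched numerical columns end up to the right of the one-hot columns, so the result can be passed straight to a learner.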