I'm trying to label-encode particular columns of a DataFrame. I have stored those column names in a list (cat_features). Now I want to use a for loop to iterate through this list's elements (which are strings) and use them to access the DataFrame's columns, but it says
TypeError: argument must be a string or number
Each element of the list is already a string, so I don't understand why it throws that error. Please help me understand why it doesn't work and what I can do to make it work.
cat_features = [x for x in features if x not in features_to_scale]
from sklearn.preprocessing import LabelEncoder
for feature in cat_features:
    le = LabelEncoder()
    dataframe[feature] = le.fit_transform(dataframe[feature])
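The error usually means that one or more of your columns contains values that LabelEncoder cannot sort together: a mix of types (for example strings alongside numbers or NaN), or objects such as lists, tuples or sets. Your column names are fine; the complaint is about the values inside the columns. Converting those columns to strings before applying the label encoder avoids the error.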
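To find out which columns are the culprits, you could list the Python types present in each one. This is only a rough sketch: the sample data and the helper name python_types_per_column are made up for illustration, so swap in your own dataframe and cat_features.

```python
import pandas as pd

# Hypothetical sample data: "city" mixes strings with an integer.
dataframe = pd.DataFrame({
    "city": ["Paris", "Rome", 3, "Paris"],
    "size": ["S", "M", "L", "S"],
})
cat_features = ["city", "size"]


def python_types_per_column(df, columns):
    """Return {column: sorted list of Python type names found in that column}."""
    return {
        col: sorted(df[col].map(lambda v: type(v).__name__).unique())
        for col in columns
    }


types_found = python_types_per_column(dataframe, cat_features)

# Columns holding more than one type are the likely cause of the TypeError.
mixed = [col for col, names in types_found.items() if len(names) > 1]
print(types_found)
print(mixed)
```

Any column that shows up in mixed (or whose only type is list, tuple or set) is a candidate for the astype(str) conversion.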
Also, instead of a loop, you can first filter your data frame down to the features you need and then use the apply function:
df = main_df[cat_features]
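df = df.astype(str)  # convert every column to strings, since LabelEncoder can't handle lists/tuples/sets or mixed types
lb = LabelEncoder()
# apply returns a new data frame, so assign the result; the encoder is refit on each column
df = df.apply(lb.fit_transform)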
Later you can combine this data frame with the remaining continuous features.
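Putting it together, the whole thing might look roughly like this. The sample data and the names continuous_features, encoded and result are assumptions for illustration; pd.concat with axis=1 is one way to do the combining step, not the only one.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical sample data: "tags" holds lists, which LabelEncoder rejects as-is.
main_df = pd.DataFrame({
    "color": ["red", "blue", "red", "green"],
    "tags": [["a"], ["b"], ["a"], ["a", "b"]],
    "price": [1.0, 2.5, 3.0, 4.5],
})
cat_features = ["color", "tags"]
continuous_features = ["price"]  # assumed: whatever is not categorical

# Convert to strings first, then encode each column with its own encoder.
encoded = main_df[cat_features].astype(str).apply(
    lambda col: LabelEncoder().fit_transform(col)
)

# Join the encoded columns back with the continuous ones, column-wise.
result = pd.concat([encoded, main_df[continuous_features]], axis=1)
print(result)
```

Using a fresh LabelEncoder per column keeps the columns independent. If you need inverse_transform later, keep the fitted encoders in a dict keyed by column name instead of discarding them.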