I'm trying to convert the labels to integers after encoding. They are objects, so I first turn them into strings:
train_df["labels"] = train_df["labels"].astype(str).astype(int)
I am getting this error:
invalid literal for int() with base 10: '[0, 1, 0, 0]'
An example row from the dataset:
text labels
[word1,word2,word3,word4] [1,0,1,0]
It's because after train_df["labels"].astype(str), each element is the string form of a whole list, such as '[0, 1, 0, 0]', and int() can't parse that string as a single integer.
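To see the failure concretely, here is a minimal sketch that reproduces it. The one-row train_df below is made up to mirror the example row in the question; it is not the asker's real data.

```python
import pandas as pd

# Hypothetical one-row frame mirroring the example row in the question.
train_df = pd.DataFrame({"labels": [[1, 0, 1, 0]]})

# astype(str) calls str() on each element, so the whole list becomes one string.
as_text = train_df["labels"].astype(str)
first = as_text.iloc[0]

# int() cannot parse a string that contains brackets and commas.
try:
    int(first)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(repr(first))
print(error_message)
```

The second astype(int) in the question runs the same int() conversion on every element, which is why the whole call fails.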
If each element in train_df["labels"] is of type list, you can do:
train_df["labels"].apply(lambda x: [int(el) for el in x])
If it's of type str, you can do:
train_df["labels"].apply(lambda x: [int(el) for el in x.strip("[]").split(",")])
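If you aren't sure which of the two cases you have, a small helper can cover both. This is only a sketch: to_int_list is a hypothetical name, the sample frame is invented, and it swaps the strip/split approach for ast.literal_eval from the standard library, which assumes each string is a valid Python list literal.

```python
import ast

import pandas as pd


def to_int_list(value):
    """Hypothetical helper: return a list of ints from a list or its string form."""
    if isinstance(value, str):
        # Safely parse text such as "[1, 0, 1, 0]" into a real list.
        value = ast.literal_eval(value)
    return [int(el) for el in value]


# Invented sample: one label stored as a string, one stored as a list.
train_df = pd.DataFrame({"labels": ["[1, 0, 1, 0]", [0, "1", 0, 0]]})

# apply returns a new Series, so assign it back to keep the result.
train_df["labels"] = train_df["labels"].apply(to_int_list)
print(train_df["labels"].tolist())
```

Note that apply doesn't modify the column in place, so the result has to be assigned back as shown.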
Presumably you want to train some model, but you can't use a pd.Series of lists to do it. You'll need to convert this into a DataFrame. I can't say how to do that without looking at more than 1 line of data.
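As a rough sketch only, and assuming every list in the column has the same length (which can't be confirmed from a single row), one common way to get one column per label looks like this. The sample data and the label_0, label_1, ... column names are made up for illustration.

```python
import pandas as pd

# Invented sample data; assumes every list has the same length.
train_df = pd.DataFrame({"labels": [[1, 0, 1, 0], [0, 1, 0, 0]]})

# Build a DataFrame from the list of lists, keeping the original row index.
label_df = pd.DataFrame(train_df["labels"].tolist(), index=train_df.index)

# Hypothetical column names, one per position in the label list.
label_df.columns = [f"label_{i}" for i in range(label_df.shape[1])]

print(label_df)
```

If the lists differ in length, the shorter rows are padded with missing values, so check for that before training.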