I have a df as
name category dummy
USA fx,ft,fe 1
INDIA fx 13
I need to convert this as
name category_fx categoty_ft category_fe dummy
USA True True True 1
INDIA True False False 13
tried with series.explode() function but not getting this output.
Use Series.str.get_dummies
by column category
with converting 0,1
to boolean by DataFrame.astype
and DataFrame.add_prefix
:
c = df.columns.difference(['category'], sort=False).tolist()
df = (df.set_index(c)['category']
.str.get_dummies(',')
.astype(bool)
.add_prefix('category_')
.reset_index())
print (df)
name category_fe category_ft category_fx
0 USA True True True
1 INDIA False False True
EDIT: If need replace one column by multiple columns you can use:
df1 = (df['category']
.str.get_dummies(',')
.astype(bool)
.add_prefix('category_'))
pos = df.columns.get_loc('category')
df = pd.concat([df.iloc[:, :pos], df1, df.iloc[:, pos+1:]], axis=1)
print (df)
name category_fe category_ft category_fx dummy
0 USA True True True 1
1 INDIA False False True 13
This solution is modifid for multiple columns:
print (df)
name category dummy category1
0 USA fx,ft,fe 1 a,f
1 INDIA fx 13 s,a
cols = ['category','category1']
dfs = [(df[c].str.get_dummies(',').astype(bool).add_prefix(f'{c}_')) for c in cols]
df = pd.concat([df, *dfs], axis=1).drop(cols, axis=1)
print (df)
name dummy category_fe category_ft category_fx category1_a \
0 USA 1 True True True True
1 INDIA 13 False False True True
category1_f category1_s
0 True False
1 False True