I have a pandas dataframe (N = 1485) that looks like this:
ID Intervention
1 Blood Draw, Flushed, Locked
1 Blood Draw, Port De-Accessed, Heparin-Locked, Tubing Changed
1 Blood Draw, Flushed
2 Blood return Verified, Flushed
2 Cap Changed
3 Port De-Accessed
I want to dummy-code each comma-separated string into its own column, so the result looks similar to this:
ID Blood Draw Flushed Locked ....
1 Yes Yes Yes
1 Yes No No
...
Thanks!
You can use pd.Series.str.get_dummies with a dictionary mapping:
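# get_dummies(', ') makes one 0/1 column per label; pop() drops 'Intervention' from df in place.
d = {1: 'yes', 0: 'no'}
# applymap is deprecated since pandas 2.1; DataFrame.map is its replacement there.
res = df.join(df.pop('Intervention').str.get_dummies(', ').applymap(d.get))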
In my opinion, it's best to convert to strings for display purposes only. Boolean values are more efficiently held and manipulated in Boolean series.
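To illustrate that point, here is a minimal sketch that keeps the indicators as Booleans and converts to strings only when printing. The names flags, res_bool, ever and res_display are made up for this example, and the data is the same sample used in the Setup section.

```python
import pandas as pd

# Same sample data as in the Setup section of this answer.
df = pd.DataFrame({'ID': [1, 1, 1, 2, 2, 3],
                   'Intervention': ['Blood Draw, Flushed, Locked',
                                    'Blood Draw, Port De-Accessed, Heparin-Locked, Tubing Changed',
                                    'Blood Draw, Flushed', 'Blood return Verified, Flushed',
                                    'Cap Changed', 'Port De-Accessed']})

# One 0/1 column per distinct label, cast to bool; df itself is left unchanged.
flags = df['Intervention'].str.get_dummies(sep=', ').astype(bool)
res_bool = df[['ID']].join(flags)

# Boolean columns aggregate directly, e.g. "did this ID ever get this intervention?"
ever = res_bool.groupby('ID').any()

# Convert to 'yes'/'no' strings only at the point of display.
as_text = flags.apply(lambda col: col.map({True: 'yes', False: 'no'}))
res_display = df[['ID']].join(as_text)
print(res_display)
```

Filtering also stays simple with this layout: res_bool[res_bool['Flushed']] selects the rows where the port was flushed, with no string comparison needed.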
Result
print(res)
   ID  Blood Draw  Blood return Verified  Cap Changed  Flushed  Heparin-Locked  \
0   1         yes                     no           no      yes              no
1   1         yes                     no           no       no             yes
2   1         yes                     no           no      yes              no
3   2          no                    yes           no      yes              no
4   2          no                     no          yes       no              no
5   3          no                     no           no       no              no

   Locked  Port De-Accessed  Tubing Changed
0     yes                no              no
1      no               yes             yes
2      no                no              no
3      no                no              no
4      no                no              no
5      no               yes              no
Setup
import pandas as pd

df = pd.DataFrame({'ID': [1, 1, 1, 2, 2, 3],
                   'Intervention': ['Blood Draw, Flushed, Locked',
                                    'Blood Draw, Port De-Accessed, Heparin-Locked, Tubing Changed',
                                    'Blood Draw, Flushed', 'Blood return Verified, Flushed',
                                    'Cap Changed', 'Port De-Accessed']})
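One caveat: the separator ', ' is matched literally, so a label with different spacing around the comma would end up as its own column. The question doesn't say whether the full 1,485-row data has that problem. If it does, a possible sketch is to normalize the spacing first. The messy rows below are invented for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical rows with inconsistent spacing around the commas.
messy = pd.DataFrame({'ID': [1, 2],
                      'Intervention': ['Blood Draw,Flushed ,  Locked',
                                       ' Flushed, Cap Changed']})

# Trim the ends, then collapse any whitespace around each comma.
cleaned = (messy['Intervention']
           .str.strip()
           .str.replace(r'\s*,\s*', ',', regex=True))

# With the spacing normalized, a bare comma works as the separator.
dummies = cleaned.str.get_dummies(sep=',')
res_clean = messy[['ID']].join(dummies)
print(res_clean)
```

The same yes/no mapping from the answer can then be applied to dummies before joining.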