python pandas dataframe boolean explode

Python - Pandas - DataFrame - Explode single column into multiple boolean columns based on conditions


Good morning chaps,

Is there a Pythonic way to explode a DataFrame column into multiple columns of boolean flags, based on some condition (str.contains in this case)?

Let's say I have this:

Position Letter 
1        a      
2        b      
3        c      
4        b      
5        b

And I'd like to achieve this:

Position Letter is_a     is_b    is_c
1        a      TRUE     FALSE   FALSE
2        b      FALSE    TRUE    FALSE
3        c      FALSE    FALSE   TRUE
4        b      FALSE    TRUE    FALSE
5        b      FALSE    TRUE    FALSE 

I can do this with a loop over 'abc', explicitly creating new df columns, but I'm wondering whether a built-in method already exists in pandas. The number of possible values, and hence the number of new columns, is variable.
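
For reference, the loop-based approach mentioned above might look roughly like the sketch below. The frame is rebuilt from the sample data, and the name `looped` is only illustrative; the list of letters is hard-coded here, which is exactly the limitation a built-in method would avoid.

```python
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({"Position": [1, 2, 3, 4, 5],
                   "Letter": ["a", "b", "c", "b", "b"]})

# Work on a copy so the original frame stays untouched
# (`looped` is a hypothetical name).
looped = df.copy()
for letter in "abc":
    # regex=False treats the pattern as a literal substring.
    looped[f"is_{letter}"] = looped["Letter"].str.contains(letter, regex=False)

print(looped)
```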

Thanks and regards.


Solution

  • Use Series.str.get_dummies(), which creates one 0/1 indicator column per distinct value:

    In [31]: df.join(df.Letter.str.get_dummies())
    Out[31]:
       Position Letter  a  b  c
    0         1      a  1  0  0
    1         2      b  0  1  0
    2         3      c  0  0  1
    3         4      b  0  1  0
    4         5      b  0  1  0
    

    or, to get boolean flags instead of 0/1:

    In [32]: df.join(df.Letter.str.get_dummies().astype(bool))
    Out[32]:
       Position Letter      a      b      c
    0         1      a   True  False  False
    1         2      b  False   True  False
    2         3      c  False  False   True
    3         4      b  False   True  False
    4         5      b  False   True  False
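
The outputs above name the new columns a, b, c rather than the is_a, is_b, is_c asked for in the question. A minimal sketch of one way to get those names is to chain DataFrame.add_prefix() onto the result. The names `flags`, `result` and `alt` are only illustrative, and the frame is rebuilt from the question's sample data.

```python
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({"Position": [1, 2, 3, 4, 5],
                   "Letter": ["a", "b", "c", "b", "b"]})

# Indicator columns -> booleans -> prefixed column names.
flags = df["Letter"].str.get_dummies().astype(bool).add_prefix("is_")
result = df.join(flags)
print(result)

# Alternative: the top-level pd.get_dummies() accepts a prefix directly.
# astype(bool) keeps the dtype boolean regardless of the pandas version.
alt = pd.get_dummies(df["Letter"], prefix="is").astype(bool)
```

Note that Series.str.get_dummies() splits each string on "|" by default, so it also handles cells that hold several values (e.g. "a|b"), whereas pd.get_dummies() treats each cell as a single category. Either way the set of new columns follows from the data, so a variable number of values needs no extra handling.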