python-3.x, pandas, pandas-groupby, pivot-table, dummy-variable

Pandas Group By And Get Dummies


I want to create dummy variables for each unique value in a column, with one row per ID. The idea is to turn the data frame into a multi-label target. How can I do it?

Data:

           ID                      L2
           A                 Firewall
           A                 Security
           B           Communications
           C                 Business
           C                 Switches

Desired Output:

ID   Firewall  Security  Communications  Business   Switches
 A      1          1             0              0         0
 B      0          0             1              0         0
 C      0          0             0              1         1

I have tried pd.pivot_table, but it requires a column to aggregate on. I have also tried the answer on this link, but it sums the values rather than producing binary dummy columns. I would much appreciate your help. Thanks a lot!


Solution

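  • Call set_index, then str.get_dummies. Since each ID can appear on several rows, collapse the duplicates by taking the max over index level 0 (max rather than sum keeps the values binary). The level argument of max was removed in pandas 2.0, so group by the index level instead:

    s = df.set_index('ID')['L2'].str.get_dummies().groupby(level=0).max().reset_index()
    s
    Out[175]: 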
      ID  Business  Communications  Firewall  Security  Switches
    0  A         0               0         1         1         0
    1  B         0               1         0         0         0
    2  C         1               0         0         0         1
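
For anyone who wants to reproduce this end to end, here is a minimal sketch. It assumes the question's data sits in a DataFrame named df with columns ID and L2 (rebuilt by hand below). The pd.crosstab variant is an alternative that is not part of the original answer; it counts occurrences and then clips the counts to 1 so the result stays binary.

```python
import pandas as pd

# Sample data rebuilt from the question (column names ID and L2 as shown there).
df = pd.DataFrame({
    "ID": ["A", "A", "B", "C", "C"],
    "L2": ["Firewall", "Security", "Communications", "Business", "Switches"],
})

# One indicator row per original row, indexed by ID.
dummies = df.set_index("ID")["L2"].str.get_dummies()

# Collapse duplicate IDs; max keeps each column at 0 or 1.
result = dummies.groupby(level=0).max().reset_index()

# Alternative (not from the original answer): count pairs, then cap at 1.
alt = pd.crosstab(df["ID"], df["L2"]).clip(upper=1).reset_index()
alt.columns.name = None  # drop the leftover "L2" label on the column axis

print(result)
print(alt)
```

Both approaches should produce the same table on this data. The groupby version stays closest to the answer above, while crosstab avoids the set_index step.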