Python Dataframe Binary Encoding


I have a dataframe that looks like this:

User Product
1 a
1 b
2 a
2 c
3 b

I want one row per user, with the products as columns and each cell set to 1 if the user purchased that product and 0 otherwise. How can I do this?


Solution

  • df.pivot_table(index="User", columns="Product", aggfunc=len).fillna(0)
    
    # Result:
    
    Product    a    b    c
    User                  
    1        1.0  1.0  0.0
    2        1.0  0.0  1.0
    3        0.0  1.0  0.0
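
Note that `aggfunc=len` counts rows, so the table above holds float counts, and a user who bought the same product twice would show 2.0 rather than 1.0. If strict integer 0/1 flags are wanted, one possible alternative (a sketch, not the answer's method) is `pd.crosstab` followed by `clip`. The sample frame is rebuilt from the question, and the name `encoded` is only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "User": [1, 1, 2, 2, 3],
    "Product": ["a", "b", "a", "c", "b"],
})

# crosstab counts each (User, Product) pair as integers;
# clip(upper=1) caps repeat purchases at 1 so every cell is 0 or 1.
encoded = pd.crosstab(df["User"], df["Product"]).clip(upper=1)
print(encoded)
```

The result is indexed by User with one integer column per product, so it can be joined back onto other per-user data directly.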