Search code examples
pythonpandasdataframeone-hot-encoding

Is is possible to one-hot based on a list of values being an element in a column?


Here's the dataframe that I'm working with:

| Name     | Vehicle Types Owned  |
| -------- | --------------       |
| Bob      | [Car, Bike]          |
| Billy    | [Car, Truck, Train]  |
| Rob      | [Plane, Train, Boat] |
| Sally    | [Bike, Boat]         |

I am looking for something like this

| Name | Car | Bike | Truck | Train | Plane | Boat |
| ---- | ----| ---  | ---   | ---   | ---   | ---  |
| Bob  | 1   | 1    | 0     | 0     | 0     | 0    |                 
| Billy| 1   | 0    | 1     | 1     | 0     | 0    |
| Rob  | 0   | 0    | 0     | 1     | 1     | 1    |
| Sally| 0   | 1    | 0     | 0     | 0     | 1    |

The original dataframe looked like this, in case that'd be more useful to work with:

| Name     | Vehicle Type Owned | Num Wheels | Top Speed | 
| -------- | --------------     | ---------  | --------- |
| Bob      | Car                | 4          | 200 mph   |
| Bob      | Bike               | 2          | 20 mph    |
| Billy    | Car                | 4          | 220 mph   |
| Billy    | Truck              | 8          | 100 mph   |
| Billy    | Train              | 80         | 86  mph   |
| Rob      | Plane              | 3          | 600 mph   |
| Rob      | Train              | 80         | 98  mph   |
| Rob      | Boat               | 3          | 128 mph   |
| Sally    | Bike               | 2          | 34  mph   |
| Sally    | Boat               | 3          | 78 mph    |

I'm using pandas.


Solution

  • Try explode then crosstab

    df = df.explode('Vehicle Types Owned')
    df_ = pd.crosstab([df['Name']], df['Vehicle Types Owned']).reset_index().rename_axis(None, axis=1)
    
    print(df_)
    
        Name  Bike  Boat  Car  Plane  Train  Truck
    0  Billy     0     0    1      0      1      1
    1    Bob     1     0    1      0      0      0
    2    Rob     0     1    0      1      1      0
    3  Sally     1     1    0      0      0      0