
Group by a column and select a specific value from the other columns in a pandas DataFrame


Input dataframe:

+----+----------+-----------+
| ID | Owns_car | owns_bike |
+----+----------+-----------+
|  1 |        1 |         0 |
|  5 |        1 |         0 |
|  7 |        0 |         1 |
|  1 |        1 |         0 |
|  4 |        1 |         0 |
|  5 |        0 |         1 |
|  7 |        0 |         1 |
+----+----------+-----------+


Expected output:

+----+----------+-----------+
| ID | Owns_car | owns_bike |
+----+----------+-----------+
|  1 |        1 |         0 |
|  5 |        1 |         1 |
|  7 |        0 |         1 |
|  4 |        1 |         0 |
+----+----------+-----------+

I want to group by ID and, for each of the other columns, select the value 1 over 0. In other words, for a given ID I want to check whether the person owns a car and whether they own a bike.


Solution

  • You can use 'max' after your groupby to take the largest value in each column per ID. Because the columns contain only 0 and 1, this returns 1 whenever any row for that ID has a 1.

    import pandas as pd

    df = pd.DataFrame({'ID': [1, 5, 7, 1, 4, 5, 7],
                       'Owns_car': [1, 1, 0, 1, 1, 0, 0],
                       'owns_bike': [0, 0, 1, 0, 0, 1, 1]})
    # Per-ID maximum of each column; reset_index() turns ID back into a column
    df.groupby('ID').max().reset_index()
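
Note that groupby sorts the group keys by default, so the call above lists the IDs as 1, 4, 5, 7, whereas the expected output lists them in order of first appearance (1, 5, 7, 4). If that order matters, one possible variation, sketched here on the same sample data, is to pass sort=False. Passing as_index=False as well keeps ID as a regular column, so reset_index() is not needed. The variable name `result` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'ID': [1, 5, 7, 1, 4, 5, 7],
                   'Owns_car': [1, 1, 0, 1, 1, 0, 0],
                   'owns_bike': [0, 0, 1, 0, 0, 1, 1]})

# sort=False keeps the groups in order of first appearance in df.
# as_index=False leaves ID as an ordinary column instead of the index.
result = df.groupby('ID', sort=False, as_index=False).max()
print(result)
```

If the columns could ever hold values other than 0 and 1, max would return the largest of them rather than a yes/no flag, so this approach relies on the data being binary.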