Search code examples
pythondataframetransformtransposeone-hot-encoding

set row's value become column and fill it based on other column's row's value


i have a dataframe like this:

id  category    year    freq
101 1           2020    1
101 1           2021    1
202 2           2020    2
202 2           2021    6
203 3           2021    2

I need to transform the dataframe based on id, category and year's value and fill the year's value with frequency for the year. The desired output is:

id  category    2020    2021
101 1           1       1
202 2           2       6
203 3           0       2

i have tried using one hot encoding, but the i can't fill each year's column with frequency.


Solution

  • Seems like a job for df.pivot_table. Notice that we'll use fill_value=0 to replace missing values with 0 (to match your expected output):

    >>> df.pivot_table(values="freq", index=["id", "category"], columns="year", fill_value=0)
    year          2020  2021
    id  category            
    101 1            1     1
    202 2            2     6
    203 3            0     2