Search code examples
pythonpandasdataframedatatablespivot

Conversion of df one column values to multiple column values in pandas


id date decision
1 2022-11-10 improve
1 2022-11-10 checked
2 2021-09-12 checked
3 2020-08-22 checked
4 2019-11-10 complete
4 2019-11-10 revise

Converting above dataframe as

id date CR Principal
1 2022-11-10 checked improve
2 2021-09-12 checked NA
3 2020-08-22 checked NA
4 2019-11-10 revise complete

Solution

  • Use GroupBy.cumcount with ascending=False for counter in descending order and pivoting by 4 columns, then use rename - add keys to dictionary for rename if 3 or 4 duplicated decisions:

    df = (df.assign(g = df.groupby(['id','date']).cumcount(ascending=False))
            .pivot(['id','date'], 'g', 'decision')
            .reindex(columns=range(4))
            .fillna(0)
            .rename(columns={0:'CR',1:'Principal',2:'final',3:'post final'})
            .rename_axis(columns=None)
            .reset_index())
    print (df)
       id        date       CR Principal  final  post final
    0   1  2022-11-10  checked   improve    0.0         0.0
    1   2  2021-09-12  checked         0    0.0         0.0
    2   3  2020-08-22  checked         0    0.0         0.0
    3   4  2019-11-10   revise  complete    0.0         0.0