
Insert missing columns into a DataFrame in sequence


I have a DataFrame like this:

id month
1 1
1 3
1 4
1 6

I want to transform it into this:

id 1 2 3 4 5 6
1 1 0 1 1 0 1

I've tried this code:

ndf = df[['id']].join(pd.get_dummies(
    df['month'])).groupby('id').max()

But it produces this:

id 1 3 4 6
1 1 1 1 1

How can I insert the missing middle columns (2 and 5) even though they are not in the data?


Solution

  • You can use pd.crosstab instead, build the full set of column labels with pd.RangeIndex from the minimum to the maximum month, and then use DataFrame.reindex to add the missing columns (optionally followed by DataFrame.reset_index):

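    import pandas as pd
    
    # RangeIndex excludes its stop value, so add 1 to include the max month
    new_cols = pd.RangeIndex(df['month'].min(), df['month'].max() + 1)
    
    res = (
        pd.crosstab(df['id'], df['month'])
         # add the missing month columns, filled with 0
         .reindex(columns=new_cols, fill_value=0)
         .reset_index()
    )
    

    Output:

    >>> res
    
       id  1  2  3  4  5  6
    0   1  1  0  1  1  0  1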
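
For reference, here is a self-contained sketch of the same idea that can be run as-is. It rebuilds the sample data from the question and shows two variants: the crosstab route, and the question's own get_dummies attempt with the reindex step added. The names `all_months`, `flags` and `dummies` are only illustrative. The `clip(upper=1)` call is an addition that assumes you want 0/1 flags: pd.crosstab counts rows, so an (id, month) pair that appears more than once would otherwise show a count above 1.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({"id": [1, 1, 1, 1], "month": [1, 3, 4, 6]})

# Every month from the smallest to the largest observed, inclusive
# (RangeIndex excludes its stop value, hence the + 1)
all_months = pd.RangeIndex(df["month"].min(), df["month"].max() + 1)

# Variant 1: crosstab counts rows per (id, month);
# clip(upper=1) caps the counts so they act as 0/1 flags (assumption: flags are wanted)
flags = (
    pd.crosstab(df["id"], df["month"])
      .clip(upper=1)
      .reindex(columns=all_months, fill_value=0)
      .reset_index()
)

# Variant 2: the get_dummies approach from the question,
# with the same reindex step added to fill in the missing months
dummies = (
    df[["id"]].join(pd.get_dummies(df["month"]))
      .groupby("id").max()
      .astype(int)  # get_dummies may return booleans; convert to 0/1
      .reindex(columns=all_months, fill_value=0)
      .reset_index()
)

print(flags)
print(dummies)
```

Both variants should give one row for id 1 with columns 1 through 6, where months 2 and 5 are 0. If the months should always span a fixed range (say 1 to 12) instead of the observed minimum and maximum, pass that range to pd.RangeIndex directly.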