Search code examples
pythonpandasdataframe

Pivot Pandas dataframe and take missed values from other dataframe


There are two dataframes:

table1

id |  time |  status
-----------------------
1  | 10:00 | conn     | 
1  | 10:01 | disconn  | 
2  | 10:02 | conn     | 
2  | 10:03 | disconn  | 
3  | 10:04 | conn     | 


table2

id |  time |
------------
3  | 10:05 |

If there is no disconn time value for ceratin id then take it from table2. How to get wished result ?

id | conn | disconn|
--------------------
1  | 10:00| 10:01  |
2  | 10:02| 10:03  |
3  | 10:04| 10:05  |

Solution

  • You can pivot, then fillna with map:

    out = (table1.pivot(index='id', columns='status', values='time')
                 .reset_index().rename_axis(columns=None)
          )
    
    out['disconn'] = out['disconn'].fillna(out['id'].map(table2.set_index('id')['time']))
    

    Variant for the second step:

    m = out['disconn'].isna()
    out.loc[m, 'disconn'] = out.loc[m, 'id'].map(table2.set_index('id')['time'])
    

    Output:

       id   conn disconn
    0   1  10:00   10:01
    1   2  10:02   10:03
    2   3  10:04   10:05
    

    If you have duplicated ids/status:

    out = (table1.assign(n=lambda d: d.groupby(['id', 'status']).cumcount())
                 .pivot(index=['id', 'n'], columns='status', values='time')
                 .reset_index().rename_axis(columns=None)
          )