Search code examples
pythonpandasdataframemergeconditional-statements

How to merge 3 DataFrames under certain conditions about date value in one of them in Python Pandas?


I have 3 DataFrames in Python Pandas like below:

df1 (ID - int, TIME - datetime)

ID  | TIME
----|------
123 | 2022-07-18
333 | 2022-07-22
444 | 2022-07-19
... | ...

df2 (both int)

ID  | VALUE
----|------
123 | 556 
333 | 12  
444 | 88  
... | ...

df3 (both int)

ID  | TIME
----|------
123 | 11114 
333 | 2
444 | 23 
... | ...

And I need to make a merge:

  • if TIME in df1 is < 2022-07-19 merge df1 with df2
  • if TIME in df1 is >= 2022-07-19 merge df1 with df3

So as a result I need something like below:

ID  | TIME       | VALUE
----|------------|-------
123 | 2022-07-18 | 556
333 | 2022-07-22 | 2
444 | 2022-07-19 | 23
... | ...        | ...

How can I do that in Python Pandas? OF course merge by ID col :)


Solution

  • If there are same index, same order ID in all 3 DataFrames use numpy.where:

    df1['VALUE'] = np.where(df1['TIME'] < '2022-07-19', df2['VALUE'], df3['VALUE'])
    

    If not use Series.map:

    df1['VALUE'] = np.where(df1['TIME'] < '2022-07-19', 
                            df1['ID'].map(df2.set_index('ID')['VALUE']), 
                            df1['ID'].map(df3.set_index('ID')['VALUE']))