Search code examples
pythonpandasmerge

pandas- merge to datasets with diff col names adding values from the second into the first


I have a couple of dataframes:

DTime               A B C
2023-02-21 00:00:01 0 0 0 
2023-02-21 00:00:02 0 1 0 
2023-02-21 00:00:03 0 0 2 
2023-02-21 00:00:04 4 2 0 


DTime             AAA BBB CC DDD EE
2023-02-21 00:00:01 0 0 0 1 0
2023-02-21 00:00:02 0 1 0 0 0
2023-02-21 00:00:03 0 0 2 0 1
2023-02-21 00:00:04 1 0 0 0 0

I need to merge the first into the second where I explicitly map the cols and sum the values. i.e. - I want A to sum into AAA, B into BBB and C into CCC:

DTime            AAA BBB CC DDD EE
2023-02-21 00:00:01 0 0 0 1 0
2023-02-21 00:00:02 0 2 0 0 0
2023-02-21 00:00:03 0 0 4 0 1
2023-02-21 00:00:04 5 2 0 0 0

I can't seem to make this happen without going line-by-line... But there must be a better way since I have thousands of rows and hundreds of cols.


Solution

  • Code

    m = {'A': 'AAA', 'B': 'BBB', 'C':'CC'}
    out = (df2.set_index('DTime')
              .add(df1.set_index('DTime').rename(m, axis=1), fill_value=0)
              .reset_index()
    )
    

    out:

                     DTime  AAA  BBB  CC  DDD   EE
    0  2023-02-21 00:00:01    0    0   0  1.0  0.0
    1  2023-02-21 00:00:02    0    2   0  0.0  0.0
    2  2023-02-21 00:00:03    0    0   4  0.0  1.0
    3  2023-02-21 00:00:04    5    2   0  0.0  0.0
    

    Example Code

    import pandas as pd
    data1 = {'DTime': ['2023-02-21 00:00:01', '2023-02-21 00:00:02', '2023-02-21 00:00:03', '2023-02-21 00:00:04'], 'A': [0, 0, 0, 4], 'B': [0, 1, 0, 2], 'C': [0, 0, 2, 0]}
    data2 = {'DTime': ['2023-02-21 00:00:01', '2023-02-21 00:00:02', '2023-02-21 00:00:03', '2023-02-21 00:00:04'], 'AAA': [0, 0, 0, 1], 'BBB': [0, 1, 0, 0], 'CC': [0, 0, 2, 0], 'DDD': [1, 0, 0, 0], 'EE': [0, 0, 1, 0]}
    df1 = pd.DataFrame(data1)
    df2 = pd.DataFrame(data2)