
Merge/concatenate dataframes on date where one dataframe has multiple instances of the same date


python, df, pandas, merge/concatenate

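import pandas as pd

# One row per (date, stock) pair, so each date appears more than once
left = pd.DataFrame(
    {
        "date": ["1999-11-30", "1999-12-31", "1999-11-30", "1999-12-31"],
        "stock": ["a", "a", "b", "b"]
    }
)

# One deflator per date
right = pd.DataFrame(
    {
        "date": ["1999-11-30", "1999-12-31"],
        "deflator": ["1", ".8"]
    }
)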

...so any time a date appears in left, I want to add a column to left holding the corresponding deflator from right.


Solution

  • Set the index of both dataframes to the date

    left = pd.DataFrame(
        {
            "stock": ["a", "a", "b", "b"],
        },
        index=["1999-11-30", "1999-12-31", "1999-11-30", "1999-12-31"],
    )
    
    right = pd.DataFrame(
        {
            "deflator": ["1", ".8"],
        },
        index=["1999-11-30", "1999-12-31"],
    )
    

    Then join right onto left; rows are matched on the index, so each deflator is repeated for every stock that shares its date

    df = left.join(right, how="outer")
    
              stock deflator
    1999-11-30  a   1
    1999-11-30  b   1
    1999-12-31  a   .8
    1999-12-31  b   .8
    

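    An outer join keeps every date that appears in either dataframe. If you only need to keep all rows of left, the default how="left" is enough. You may also want to convert the index to datetime.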
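
    If you would rather keep date as a regular column, as in the question, the same result can probably be reached with a left merge on that column. The following is a minimal sketch, not part of the original answer. It assumes you want the dates parsed with pd.to_datetime and the deflators converted to floats; the validate="many_to_one" check is an optional safeguard that raises an error if right ever contains a duplicated date.

```python
import pandas as pd

left = pd.DataFrame(
    {
        "date": ["1999-11-30", "1999-12-31", "1999-11-30", "1999-12-31"],
        "stock": ["a", "a", "b", "b"],
    }
)
right = pd.DataFrame(
    {
        "date": ["1999-11-30", "1999-12-31"],
        "deflator": ["1", ".8"],
    }
)

# Assumption: real datetimes and numeric deflators are wanted downstream
left["date"] = pd.to_datetime(left["date"])
right["date"] = pd.to_datetime(right["date"])
right["deflator"] = right["deflator"].astype(float)

# A left merge keeps every row of left, in its original order, and copies
# the matching deflator onto each row that shares the date
merged = left.merge(right, on="date", how="left", validate="many_to_one")
print(merged)
```

    Unlike the outer join on the index, this keeps the rows in the order they had in left and leaves date available as an ordinary column.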