Search code examples
pythonpython-2.7pandasinner-join

Python pandas: Combine two dataframes by date index and a common column value


There are two dateframes, one is df1, another is df2 as follows:

df1:

               a   b    id
2010-01-01     1   4    21
2010-01-01     2   5    22
2010-01-01     3   6    23
2010-01-01     4   7    24
2010-01-02     1   4    21
2010-01-02     2   5    22
2010-01-02     3   6    23
2010-01-02     4   7    24
2010-01-03     1   4    21
2010-01-03     2   5    22
2010-01-03     3   6    23
2010-01-03     4   7    24
...........................

df2:

               c   d    id
2010-01-02     1   4    21
2010-01-02     2   5    22
2010-01-02     3   6    23
2010-01-02     4   7    24
2010-01-03     1   4    21
2010-01-03     2   5    22
2010-01-03     3   6    23
2010-01-03     4   7    24
...........................

I want to merge or join two dataframes by common index (please notice that some index in df1 is not in df2) and id, and I expected a joined dataframe as following

               c   d    a   b   id
2010-01-02     1   4    1   4   21
2010-01-02     2   5    2   5   22
2010-01-02     3   6    3   6   23
2010-01-02     4   7    4   7   24
2010-01-03     1   4    1   4   21
2010-01-03     2   5    2   5   22
2010-01-03     3   6    3   6   23
2010-01-03     4   7    4    7  24

I used the following code

 df = df1.join(df2, on = ['id'], how='inner')

but that did not work


Solution

  • IIUC:

    In [388]: df2.set_index('id', append=True).join(df1.set_index('id', append=True)) \
                 .reset_index(level='id')
    Out[388]:
                id  c  d  a  b
    2010-01-02  21  1  4  1  4
    2010-01-02  22  2  5  2  5
    2010-01-02  23  3  6  3  6
    2010-01-02  24  4  7  4  7
    2010-01-03  21  1  4  1  4
    2010-01-03  22  2  5  2  5
    2010-01-03  23  3  6  3  6
    2010-01-03  24  4  7  4  7