Search code examples
pythonpandasdataframeconcatenation

Merging and Concating using pandas


Hi I have 2 dataframes as below:

key Apple Banana
abc 1 12
bcd 23 21
key Train Car
abc 11 20
jkn 2 19

I want to merge these 2 dataframes together with my key column so that I can get following table:

key Train Car Banana Apple
abc 11 20 12 1
jkn 2 19 0/NA 0/NA
bcd 0/NA 0/NA 21 23

For columns where I don't have any record like for jkn / Apple either 0 or NA should be printed.

Currently I tried using pd.concat but am not exactly able to figure out how to get my desired result.


Solution

  • Use pd.merge() with how='outer', read further in documentation:

    import pandas as pd
    import io
    
    data_string = """key    Apple   Banana
    abc 1   12
    bcd 23  21
    """
    
    df1 = pd.read_csv(io.StringIO(data_string), sep='\s+')
    
    data_string = """key    Train   Car
    abc 11  20
    jkn 2   19"""
    df2 = pd.read_csv(io.StringIO(data_string), sep='\s+')
    
    # Solution
    df_result = pd.merge(df1, df2, on=['key'], how='outer')
    
    print(df_result)
       key  Apple  Banana  Train   Car
    0  abc    1.0    12.0   11.0  20.0
    1  bcd   23.0    21.0    NaN   NaN
    2  jkn    NaN     NaN    2.0  19.0