Search code examples
pythonpandasdataframemergeconcatenation

How to assign a new descriptive column while concatenating dataframes


I have two data frames that i want to concatenate in python. However, I want to add another column type in order to distinguish among the columns.

Here is my sample data:

import pandas as pd
df = pd.DataFrame({'numbers': [1, 2, 3], 'colors': ['red', 'white', 'blue']}, 
                  columns=['numbers', 'colors'])

df1 = pd.DataFrame({'numbers': [7, 9, 9], 'colors': ['yellow', 'brown', 'blue']}, 
                  columns=['numbers', 'colors'])
pd.concat([df,df1])

This code will give me the following result:

    numbers colors
0   1   red
1   2   white
2   3   blue
0   7   yellow
1   9   brown
2   9   blue

but what I would like to get is as follows:

  numbers colors  type
0   1   red       first
1   2   white     first
2   3   blue      first
0   7   yellow    second
1   9   brown     second
2   9   blue      second

type column is going to help me to differentiate between the values of the two data frames.

Can anyone help me with this please?


Solution

  • Use DataFrame.assign for new columns:

    df = pd.concat([df.assign(typ='first'),df1.assign(typ='second')])
    print (df)
       numbers  colors     typ
    0        1     red   first
    1        2   white   first
    2        3    blue   first
    0        7  yellow  second
    1        9   brown  second
    2        9    blue  second
    

    Using a list-comprehension

    df = pd.concat([d.assign(typ=f'id{i}') for i, d in enumerate([df, df1])], ignore_index=True)
    
       numbers  colors  typ
    0        1     red  id0
    1        2   white  id0
    2        3    blue  id0
    3        7  yellow  id1
    4        9   brown  id1
    5        9    blue  id1