python pandas dataframe concatenation

Concatenate a list of DataFrames by index in Python


I'm stuck on a problem and hope you can help. I have a list of 3 lists, each holding 5 DataFrames (3x5), like this:

[[df1, df2, df3, df4, df5],
 [df11, df12, df13, df14, df15],
 [df21, df22, df23, df24, df25]]

What I'd like to do is concatenate the DataFrames into a single list of DataFrames (i.e. dfs), grouping them by their position in the inner lists:

dfs = [dfs1, dfs2, dfs3, dfs4, dfs5]

with

dfs1 = pd.concat([df1, df11, df21])
dfs2 = pd.concat([df2, df12, df22])
dfs3 = pd.concat([df3, df13, df23])
dfs4 = pd.concat([df4, df14, df24])
dfs5 = pd.concat([df5, df15, df25])

Any suggestion is appreciated. Thank you!


Solution

  • You can use zip to group the DataFrames by their position in the inner lists, then store the concatenated results in a list of 5 elements:

    lst = [[df1, df2, df3, df4, df5],
           [df11, df12, df13, df14, df15],
           [df21, df22, df23, df24, df25]]
    dfs = [pd.concat(x) for x in zip(*lst)]
    
    >>> len(dfs)
    5
    
    >>> dfs[0]
        0   1   2
    0  16  14  13  # <- df1
    1  18  17  12
    2  12  13  16
    3  11  16  17
    4  11  18  17
    0  18  13  12  # <- df11
    1  14  18  12
    2  11  17  13
    3  17  18  15
    4  18  18  12
    0  16  13  15  # <- df21
    1  16  11  18
    2  12  18  15
    3  16  14  14
    4  17  12  11
    
    >>> dfs[4]
        0   1   2
    0  52  54  54  # <- df5
    1  52  53  57
    2  56  51  53
    3  54  58  52
    4  57  51  51
    0  51  58  58  # <- df15
    1  57  54  56
    2  56  58  58
    3  53  55  56
    4  53  56  53
    0  51  53  54  # <- df25
    1  52  55  58
    2  52  54  56
    3  58  54  56
    4  52  58  56
    

    Note: if you need to concatenate side by side (along the columns axis, aligning rows on the index), use pd.concat(..., axis=1) to get a wide DataFrame instead of a long one.

    Minimal Reproducible Example:

    import numpy as np
    import pandas as pd
    lst = [[pd.DataFrame(np.random.randint(1+i*10, 9+i*10, (5, 3)))
            for i in range(1, 6)] for j in range(0, 3)]
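
For anyone who wants to run the whole answer in one go, here is a minimal end-to-end sketch. It is not part of the original answer: the helper name make_frames and the fixed seed are assumptions added for illustration, and it uses NumPy's Generator API where the example above uses np.random.randint. It builds the 3x5 nested list, transposes it with zip, and shows the long (default), re-indexed, and wide (axis=1) variants.

```python
import numpy as np
import pandas as pd


def make_frames(n_lists=3, n_frames=5, seed=0):
    """Hypothetical helper: build n_lists lists of n_frames small DataFrames.

    The seed is an assumption, added only so the run is reproducible.
    """
    rng = np.random.default_rng(seed)
    return [[pd.DataFrame(rng.integers(1 + i * 10, 9 + i * 10, (5, 3)))
             for i in range(1, n_frames + 1)]
            for _ in range(n_lists)]


lst = make_frames()

# zip(*lst) transposes the 3x5 nesting into 5 tuples of 3 DataFrames each,
# so every tuple holds the frames that share a position in the inner lists.
dfs_long = [pd.concat(group) for group in zip(*lst)]

# ignore_index=True replaces the repeated 0..4 row labels with a fresh 0..14 range.
dfs_reset = [pd.concat(group, ignore_index=True) for group in zip(*lst)]

# axis=1 places the frames side by side, aligning rows on the index.
dfs_wide = [pd.concat(group, axis=1) for group in zip(*lst)]

print(len(dfs_long), dfs_long[0].shape, dfs_wide[0].shape)
# prints: 5 (15, 3) (5, 9)
```

Whether to keep the repeated row labels, reset them, or go wide depends on what you do next with the result: the default keeps each source frame's labels, which is why 0..4 appears three times in the output shown above.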