
Concatenating the results of apply in pandas


I am trying to apply a function to a column of a dataframe. The function returns a dataframe for each value, and I want to concatenate all of those dataframes into one.

Why does the first option work and the second not?

import numpy as np
import pandas as pd

def testdf(n):
    test = pd.DataFrame(np.random.randint(0,n*100,size=(n*3, 3)), columns=list('ABC'))
    test['index'] = n
    return test

test = pd.DataFrame({'id': [1,2,3,4]})

testapply = test['id'].apply(func=testdf)

# option 1
pd.concat([testapply[0], testapply[1], testapply[2], testapply[3]])

# option 2
pd.concat([testapply])

Solution

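  • pd.concat expects a sequence of pandas objects. Option 1 passes a list of four dataframes, so they are stacked. Option 2 passes a list containing a single pd.Series (whose elements happen to be dataframes), so there is nothing to concatenate and you just get that Series back as is.
    To fix option 2, unpack the Series so that each dataframe becomes its own list element (the values are random, so your numbers will differ):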

    print(pd.concat([*testapply])) 
    

          A    B    C  index
    0    91   15   91      1
    1    93   85   91      1
    2    26   87   74      1
    0   195  103  134      2
    1    14   26  159      2
    2    96  143    9      2
    3    18  153   35      2
    4   148  146  130      2
    5    99  149  103      2
    0   276  150  115      3
    1   232  126   91      3
    2    37  242  234      3
    3   144   73   81      3
    4    96  153  145      3
    5   144   94  207      3
    6   104  197   49      3
    7     0   93  179      3
    8    16   29   27      3
    0   390   74  379      4
    1    78   37  148      4
    2   350  381  260      4
    3   279  112  260      4
    4   115  387  173      4
    5    70  213  378      4
    6    43   37  149      4
    7   240  399  117      4
    8   123    0   47      4
    9   255  172    1      4
    10  311  329    9      4
    11  346  234  374      4