Tags: python, pandas, python-3.6, concatenation

Error when concatenating two DataFrames whose indexes are strings


I need to concatenate two DataFrames whose indexes are strings. However, I get the following error:

TypeError: Cannot concatenate list of ['DataFrame', 'DataFrame']

I did some research, but I didn't find an explanation of this error.

Here is an example of what is happening:

Data frame 1:

    a=pd.DataFrame({'a1':[1,2,3],'a2':[3,4,5],'i':[7,10,11]})

Data Frame 2:

    b=pd.DataFrame({'b1':[5,6],'b2':[7,8],'i':[7,11]})

If I set the index to column 'i' and that column holds integers, I get the intended result, as shown below:

code:

    import pandas as pd

    a=pd.DataFrame({'a1':[1,2,3],'a2':[3,4,5],'i':[7,10,11]})

    b=pd.DataFrame({'b1':[5,6],'b2':[7,8],'i':[7,11]})

    print('\na:\n',a,'\n\nb:\n',b)

    new_a = a.set_index('i', drop=True)

    new_b=b.set_index('i', drop=True)

    print('\nnew_a:\n',new_a,'\n\nnew_b:\n',new_b)

    c = pd.concat([new_a,new_b],axis=1)

    print('\n\nc:\n',c,'\n')

output:

    a:
        a1  a2   i
    0   1   3   7
    1   2   4  10
    2   3   5  11 

    b:
        b1  b2   i
    0   5   7   7
    1   6   8  11

    new_a:
         a1  a2
    i         
    7    1   3
    10   2   4
    11   3   5 

    new_b:
         b1  b2
    i         
    7    5   7
    11   6   8 

    c:
         a1  a2   b1   b2
    i                   
    7    1   3  5.0  7.0
    10   2   4  NaN  NaN
    11   3   5  6.0  8.0 

However, if I set the index to column 'i' and that column holds strings, the error comes up, as shown below:

code:

    import pandas as pd

    a=pd.DataFrame({'a1':[1,2,3],'a2':[3,4,5],'i':['i7','i10','i11']})

    b=pd.DataFrame({'b1':[5,6],'b2':[7,8],'i':['i7','i11']})

    print('\na:\n',a,'\n\nb:\n',b)

    new_a = a.set_index('i', drop=True)

    new_b=b.set_index('i', drop=True)

    print('\nnew_a:\n',new_a,'\n\nnew_b:\n',new_b)

    c = pd.concat([new_a,new_b],axis=1)

    print('\n\nc:\n',c,'\n')

output:

    a:
        a1  a2    i
    0   1   3   i7
    1   2   4  i10
    2   3   5  i11 

    b:
        b1  b2    i
    0   5   7   i7
    1   6   8  i11

    new_a:
          a1  a2
    i          
    i7    1   3
    i10   2   4
    i11   3   5 

    new_b:
          b1  b2
    i          
    i7    5   7
    i11   6   8

    Traceback (most recent call last):

      File "<ipython-input-266-6b35fe042e4a>", line 13, in <module>
        c = pd.concat([new_a,new_b],axis=1)

      File "C:\mypath\concat.py", line 225, in concat
        copy=copy, sort=sort)

      File "C:\Users\mypath\concat.py", line 378, in __init__
        self.new_axes = self._get_new_axes()

      File "C:\Users\mypath\concat.py", line 445, in _get_new_axes
        new_axes[i] = self._get_comb_axis(i)

      File "C:\Users\mypath\concat.py", line 470, in _get_comb_axis
        .format(types=types))

    TypeError: Cannot concatenate list of ['DataFrame', 'DataFrame']

If anyone could help me figure out what the issue is, I would appreciate it.

Thanks.


Solution

  • Use pd.concat with sort=False:

    c = pd.concat([new_a,new_b],axis=1,sort=False)
    

    If that does not work, use DataFrame.merge instead:

    c=a.merge(b,on='i',how='outer').set_index('i')
    

    Output:

         a1  a2   b1   b2
    i7    1   3  5.0  7.0
    i10   2   4  NaN  NaN
    i11   3   5  6.0  8.0
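
For reference, the two options above can be put side by side in one self-contained script. This is a minimal sketch that reuses the question's data; the names `via_concat` and `via_merge` are only illustrative. It also prints one observable difference between the two cases: the string index is not in sorted order, because strings compare lexicographically ('i10' < 'i11' < 'i7'), whereas the integer index 7, 10, 11 is. Whether that is what triggers the error is an assumption, though it would be consistent with sort=False avoiding it.

```python
import pandas as pd

a = pd.DataFrame({'a1': [1, 2, 3], 'a2': [3, 4, 5], 'i': ['i7', 'i10', 'i11']})
b = pd.DataFrame({'b1': [5, 6], 'b2': [7, 8], 'i': ['i7', 'i11']})

new_a = a.set_index('i')
new_b = b.set_index('i')

# Strings compare lexicographically, so 'i7' sorts after 'i10' and 'i11':
# the string index is not in increasing order.
print(new_a.index.is_monotonic_increasing)  # False

# Option 1: align on the index without asking pandas to sort the result.
via_concat = pd.concat([new_a, new_b], axis=1, sort=False)

# Option 2: outer merge on column 'i', then use that column as the index.
via_merge = a.merge(b, on='i', how='outer').set_index('i')

# Row order may differ between the two options depending on the pandas
# version, so sort both by index before comparing them by eye.
print(via_concat.sort_index())
print(via_merge.sort_index())
```

Both results should contain the same rows, with NaN in b1 and b2 for 'i10', which has no match in b.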