python, python-3.x, pandas, dataframe, concatenation

Pandas row concatenation behaves unexpectedly: concatenates w.r.t. rows AND columns at the same time


I've done this hundreds of times before with no problems, but now I think I'm having a brainfart. I have the following two DataFrames, which I want to row-concatenate: I just want to add df2 to the bottom of df1.

df1: 
                  0         1         2  ...      4093      4094      4095
images0.jpg     0.0  0.000000  0.000000  ...  0.000000  0.000000  2.646948
images1.jpg     0.0  0.000000  0.000000  ...  0.000000  0.000000  2.341892
images2.jpg     0.0  0.000000  0.000000  ...  0.000000  2.771901  0.652107
images6.jpg     0.0  0.000000  0.000000  ...  0.000000  0.000000  1.415491
images7.jpg     0.0  0.000000  0.316132  ...  0.000000  0.000000  2.481199
            ...       ...       ...  ...       ...       ...       ...
images2901.jpg  0.0  0.000000  0.000000  ...  0.934915  0.000000  0.000000
images2902.jpg  0.0  0.000000  0.000000  ...  1.821516  0.000000  0.000000
images2903.jpg  0.0  0.594903  0.000000  ...  4.503857  1.291129  0.000000
images2904.jpg  0.0  0.000000  0.000000  ...  0.000000  2.801172  0.000000
images2905.jpg  0.0  0.000000  0.000000  ...  0.000000  6.153142  0.000000

[2903 rows x 4096 columns]
---------------------------------------------------------------------
df2: 
             0     1         2     3     ...      4092      4093  4094      4095
images3.jpg   0.0   0.0  0.000000   0.0  ...  0.000000  0.000000   0.0  2.298852
images4.jpg   0.0   0.0  0.000000   0.0  ...  0.593716  0.621494   0.0  0.386869
images5.jpg   0.0   0.0  1.153148   0.0  ...  0.048982  0.000000   0.0  2.601259

[3 rows x 4096 columns]

Both DataFrames contain only float64 elements. I then do df1 = df1.append(df2) or df1 = pd.concat([df1, df2], axis=0). However, this gives me the following:

df1: 
                  0    1         2    3  ...  996       997       998  999
images0.jpg     NaN  NaN       NaN  NaN  ...  0.0  3.252266  0.000000  0.0
images1.jpg     NaN  NaN       NaN  NaN  ...  0.0  3.010184  0.000000  0.0
images2.jpg     NaN  NaN       NaN  NaN  ...  0.0  2.849794  6.082187  0.0
images6.jpg     NaN  NaN       NaN  NaN  ...  0.0  1.281688  0.000000  0.0
images7.jpg     NaN  NaN       NaN  NaN  ...  0.0  1.096831  0.000000  0.0
            ...  ...       ...  ...  ...  ...       ...       ...  ...
images2904.jpg  NaN  NaN       NaN  NaN  ...  0.0  1.820635  2.063830  0.0
images2905.jpg  NaN  NaN       NaN  NaN  ...  0.0  3.845408  0.415828  0.0
images3.jpg     0.0  0.0  0.000000  0.0  ...  NaN       NaN       NaN  NaN
images4.jpg     0.0  0.0  0.000000  0.0  ...  NaN       NaN       NaN  NaN
images5.jpg     0.0  0.0  1.153148  0.0  ...  NaN       NaN       NaN  NaN

[2906 rows x 8192 columns]

It seems to concatenate w.r.t. both rows AND columns, but I just want to concatenate row-wise. I'm missing something silly/obvious, aren't I?


Solution

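  • I think the problem is that the column names are not the same in the two DataFrames: my guess is that one has string labels ('0', '1', ...) and the other has integers (0, 1, ...). pd.concat aligns on column labels, so labels that don't match become separate columns padded with NaN, which is why you end up with 8192 columns instead of 4096.

    So both frames need the same column labels, here integers: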

    pd.concat([df1.rename(columns=int), df2.rename(columns=int)])
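
To check whether mismatched label types really are the cause, the situation can be reproduced on a small scale. The following is a minimal sketch, not the asker's actual data: the two tiny frames, their values, and the names `bad` and `good` are made up for illustration, and it assumes df1 carries string column labels while df2 carries integers.

```python
import numpy as np
import pandas as pd

# Hypothetical small frames standing in for the 4096-column ones in the question.
df1 = pd.DataFrame(np.ones((2, 3)),
                   index=["images0.jpg", "images1.jpg"],
                   columns=["0", "1", "2"])   # string labels (assumed cause)
df2 = pd.DataFrame(np.zeros((1, 3)),
                   index=["images3.jpg"],
                   columns=[0, 1, 2])         # integer labels

# Diagnose: the two column indexes have different dtypes.
print(df1.columns.dtype)
print(df2.columns.dtype)

# Mismatched labels: concat takes the union of the columns, so the width doubles
# and the non-overlapping cells are filled with NaN.
bad = pd.concat([df1, df2], axis=0)
print(bad.shape)   # (3, 6)

# Normalise the labels to int on both sides, then concatenate row-wise.
good = pd.concat([df1.rename(columns=int), df2.rename(columns=int)], axis=0)
print(good.shape)  # (3, 3)
```

If the two dtypes printed for your real frames differ, the rename above should give the expected 2906 x 4096 result. The sketch uses pd.concat only, since DataFrame.append was removed in pandas 2.0.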