Tags: python, python-3.x, pandas, dataframe, concatenation

pandas: can't concat DataFrames


I am using pandas and tried to concatenate several DataFrames, but it fails, so I have a question. The code is as follows:

df # shape is (27796, 876)
genes_pca # shape is (27796, 50)
cells_pca # shape is (27796, 15)
# concat along axis=1; expected result shape is (27796, 941), i.e. 876 + 50 + 15 columns
df = pd.concat([df, genes_pca, cells_pca], axis=1)

I got this error:

File "/opt/conda/lib/python3.7/site-packages/pandas/core/reshape/concat.py", line 287, in concat
moa-gpu_1  |     return op.get_result()
moa-gpu_1  |   File "/opt/conda/lib/python3.7/site-packages/pandas/core/reshape/concat.py", line 503, in get_result
moa-gpu_1  |     mgrs_indexers, self.new_axes, concat_axis=self.bm_axis, copy=self.copy,
moa-gpu_1  |   File "/opt/conda/lib/python3.7/site-packages/pandas/core/internals/concat.py", line 84, in concatenate_block_managers
moa-gpu_1  |     return BlockManager(blocks, axes)
moa-gpu_1  |   File "/opt/conda/lib/python3.7/site-packages/pandas/core/internals/managers.py", line 149, in __init__
moa-gpu_1  |     self._verify_integrity()
moa-gpu_1  |   File "/opt/conda/lib/python3.7/site-packages/pandas/core/internals/managers.py", line 326, in _verify_integrity
moa-gpu_1  |     raise construction_error(tot_items, block.shape[1:], self.axes)
moa-gpu_1  | ValueError: Shape of passed values is (39742, 941), indices imply (31778, 941)

I don't understand what the numbers (39742, 941) and (31778, 941) in this error mean.


Solution

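  • Have you tried resetting the indexes? With axis=1, pd.concat aligns rows on index labels rather than by position, so DataFrames whose indexes differ can produce a row count other than the expected 27796.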

    df.reset_index(drop=True, inplace=True)
    genes_pca.reset_index(drop=True, inplace=True)
    cells_pca.reset_index(drop=True, inplace=True)
    df = pd.concat([df, genes_pca, cells_pca], axis=1)
    

    It also makes sense to check each DataFrame for duplicate index values, e.g.

    df.index.is_unique
    

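    If duplicate index labels are present, the repeated rows can be dropped. Note that df.drop_duplicates() compares row values, not index labels, so filter on the index instead:

    df = df[~df.index.duplicated(keep="first")]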
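
To illustrate why resetting the indexes can help, here is a minimal sketch with two hypothetical toy frames (left and right are stand-ins, not the asker's data). It shows that pd.concat with axis=1 returns the union of the index labels when they do not match, which is one plausible reason the row counts in the error differ from 27796, and that the shapes line up again once both indexes are reset.

```python
import pandas as pd

# Hypothetical toy frames standing in for df and genes_pca.
left = pd.DataFrame({"a": [1, 2, 3]}, index=[0, 1, 2])
right = pd.DataFrame({"b": [10, 20, 30]}, index=[2, 3, 4])  # only label 2 is shared

# axis=1 aligns rows on index labels, so the result spans the union of labels
# and fills the gaps with NaN.
misaligned = pd.concat([left, right], axis=1)
print(misaligned.shape)  # (5, 2)

# After resetting both indexes to 0..n-1, rows are matched by position.
aligned = pd.concat(
    [left.reset_index(drop=True), right.reset_index(drop=True)],
    axis=1,
)
print(aligned.shape)  # (3, 2)
```

Resetting the index is only appropriate when the rows of all frames are already in the same order; if the index labels carry meaning, it is safer to find out why they differ first.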