Concatenating two dataframes causes column header of index to disappear


I have two dataframes df_old and df_new.

df_old has a date column and a bunch of other columns. df_new has ONLY a date column, which is also set as its index.

The dates in df_new are a superset of those in df_old.

I want to concatenate the dataframes so that the values in df_old's columns line up with the dates in df_new, and any date with no match in df_old gets NaN.

import pandas as pd

# df_old is the existing dataframe described above (not shown here)
df_new = pd.DataFrame()
df_new["date"] = pd.date_range(start='2023-12-31', end='2024-12-31')
df_new.set_index('date', inplace=True)
df_new = pd.concat([df_new, df_old], join="outer")

After doing this, however, the 'date' header on the index of df_new has become an empty string.

Any advice here? I tried renaming the column by index reference [0], but no joy.

Thanks in advance


Solution

  • "df_new has ONLY a date column, which is also set as its index." So what you have is almost just an Index (strictly, a DataFrame without columns).

    So you should reindex df_old rather than concatenate:

    out = df_old.reindex(df_new.index)
    

    Given your example of df_old, it looks like "date" is the column axis name.

    You could either keep it and remove the index name:

    out = df_old.reindex(df_new.index.rename(None))
    

    Or remove the column axis name and keep the index name:

    out = df_old.reindex(df_new.index).rename_axis(columns=None)
    

    Or remove both names:

    out = df_old.reindex(df_new.index).rename_axis(index=None, columns=None)
    

    Or move the dates back into a regular column:

    out = df_old.reindex(df_new.index).rename_axis(columns=None).reset_index()
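
For reference, here is a minimal sketch of the reindex approach from start to finish. The original df_old was not shown, so the one below is a hypothetical stand-in: it assumes the dates are the index of df_old, that there is a single made-up column called "value", and that "date" is the name of the column axis.

```python
import pandas as pd

# Hypothetical stand-in for df_old: dates as the index, one value column,
# and "date" as the name of the column axis (an assumption, see above).
df_old = pd.DataFrame(
    {"value": [1.0, 2.0, 3.0]},
    index=pd.to_datetime(["2024-01-02", "2024-01-05", "2024-01-09"]),
)
df_old.columns.name = "date"

# df_new as in the question: no columns, only a DatetimeIndex named "date".
df_new = pd.DataFrame(
    index=pd.date_range("2023-12-31", "2024-12-31", name="date")
)

# Align df_old to the full date range; dates absent from df_old become NaN.
# The result takes its index (and the index name) from df_new.index.
out = df_old.reindex(df_new.index)

# Drop the column axis name and move the dates back into a regular column.
tidy = out.rename_axis(columns=None).reset_index()
```

If df_old keeps its dates in a regular column instead of the index, it would need df_old.set_index("date") before the reindex call.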