python pandas numpy concatenation autofill

How to copy values from two dataframes with datetime index, to fill one data frame?

I want to define a function as follow...

I have two dataframes, in all rows I have float values. Both data frames have date time as index. Both data frames have different columns.

I want to take the smallest data frame and copy those columns into biggest data frame. if smallest data frame start after biggest data frame then I'd like to fill the first dates in biggest data frame with the first values in the first row in smallest data frame.

If smallest data frame finish before, then I'd like to fill the last dates in the biggest data frame with the last values in the last row in the smallest data frame.

The result must be the biggest data frame with columns of the smallest df with the last mentioned details

Here's what I did, the problem is I have null values.

def copy_up(df1, df2):
''' df1 and df2 must finish with the same date when you pass them as argument
df1, df2: Date time index, all float values in all columns
Returns the biggest data frame with all columns of the smallest df, fill the initial days repeating the first row of smallest df
'''

    #Checking columns are unique
    columnas_df1 = set(df1.columns)
    columnas_df2 = set(df2.columns)
    columnas_repetidas = columnas_df1.intersection(columnas_df2)


    if len(columnas_repetidas) > 0:
        print("¡Hay columnas repetidas en los DataFrames!")
        print("Columnas repetidas:", columnas_repetidas)
        return 0
    else:
        print("No hay columnas repetidas en los DataFrames.")
    # Compare which one is bigger

    if len(df1) > len(df2):
        df_grande = df1.copy()
        df_pequeno = df2.copy()
    else:
        df_grande = df2.copy()
        df_pequeno = df1.copy()

    #N of times we will repeat the first row is restante
    restante = len(df_grande) - len(df_pequeno)

    #Repeating first row in biggest data frame
    for i in df_pequeno.columns:
        lista_ultimos = list(df_pequeno[i].values)
        lista_primer = [lista_ultimos[0]] * restante
        final = lista_primer
        final.extend(lista_ultimos)
        df_grande[i] = final
    
    return df_grande

I did something similar with copy_down(df1, df2) function then I did pd.concat([copy_down(df1, df2), copy_up(df1, df2)])

Solution

Assuming these two DataFrames are similar to the ones you refer to:

import pandas as pd

# Example DataFrame 1 (biggest)
df1 = pd.DataFrame(
    {
        "A": [1.1, 2.2, 3.3, 4.4, 5.5],
        "B": [6.6, 7.7, 8.8, 9.9, 10.0],
    },
    index=pd.date_range(start="2024-01-01", periods=5),
)

# Example DataFrame 2 (smallest)
df2 = pd.DataFrame(
    {
        "C": [11.1, 12.2, 13.3],
        "D": [14.4, 15.5, 16.6],
    },
    index=pd.date_range(start="2024-01-03", periods=3),
)

              A     B
2024-01-01  1.1   6.6
2024-01-02  2.2   7.7
2024-01-03  3.3   8.8
2024-01-04  4.4   9.9
2024-01-05  5.5  10.0
               C     D
2024-01-03  11.1  14.4
2024-01-04  12.2  15.5
2024-01-05  13.3  16.6

If I understand you logic, you can do an outer merge, and then back-fill and forward-fill the empty values:

df = pd.merge(df1, df2, how="outer", left_index=True, right_index=True).bfill().ffill()

For the example the result would be:

              A     B     C     D
2024-01-01  1.1   6.6  11.1  14.4
2024-01-02  2.2   7.7  11.1  14.4
2024-01-03  3.3   8.8  11.1  14.4
2024-01-04  4.4   9.9  12.2  15.5
2024-01-05  5.5  10.0  13.3  16.6