Is it possible to copy a dataframe in the middle of a method chain to a new variable? Something like:
import pandas as pd

df = (pd.DataFrame([[2, 4, 6],
                    [8, 10, 12],
                    [14, 16, 18],
                    ])
      .assign(something_else=100)
      .div(2)
      .copy_to_new_variable(df_imag)  # Imaginary method to copy df to df_imag.
      .div(10)
      )
print(df_imag)
would then return:
     0    1    2  something_else
0  1.0  2.0  3.0            50.0
1  4.0  5.0  6.0            50.0
2  7.0  8.0  9.0            50.0
.copy_to_new_variable(df_imag) could be replaced by df_imag = df.copy(), but that would break the method chain.
Actually, this is what I was looking for. Check the link: the idea comes from Matt Harrison (who has written several books about pandas), who uses it for debugging method chains.
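import pandas as pd

def to_df(df, name):
    # Store a copy of the intermediate dataframe under a global name.
    globals()[name] = df.copy()
    # Return the dataframe unchanged so the chain can continue.
    return df

df = (pd.DataFrame([[1, 2, 3],
                    [10, 10, 10],
                    ], columns=["A", "B", "C"]
                   )
      .set_index("C")
      .pipe(to_df, "df_imag")
      .sum()
      )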
df_imag is then the intermediate dataframe as described in the question.
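One caveat: globals() refers to the module in which to_df is defined, so the trick works in a notebook or a single script but not if the helper is imported from another module. A possible variation, sketched here on the example from the question, is to stash the copy in a dict that you pass in. The names stash and snapshots are made up for this illustration and are not part of pandas; only DataFrame.pipe is the real API.

```python
import pandas as pd


def stash(df, store, name):
    """Hypothetical helper: save a copy of df in store[name], then pass df on."""
    store[name] = df.copy()
    return df  # returned unchanged so the chain continues


# Plain dict that collects intermediate results (the name is an assumption).
snapshots = {}

result = (
    pd.DataFrame([[2, 4, 6],
                  [8, 10, 12],
                  [14, 16, 18]])
    .assign(something_else=100)
    .div(2)
    .pipe(stash, snapshots, "df_imag")  # snapshot after .div(2)
    .div(10)
)

df_imag = snapshots["df_imag"]
print(df_imag)
```

This keeps the chain intact like the globals() version, but the snapshot lives in an ordinary object, so nothing is silently overwritten in the global namespace.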