Tags: python, pandas, dataframe, monkeypatching

pandas DataFrame lives in "limbo" when monkey patching its constructor


I'm trying to monkey patch the DataFrame constructor in pandas. Inside the patch, the value returned by the original constructor is None. Yet in the outer scope the DataFrame is constructed as expected, even though the patch apparently returns None.

import pandas as pd

f = pd.DataFrame.__init__


def make_df(*args, **kwargs):
    print('Called Before')
    df = f(*args, **kwargs)
    print(f"df from inner scope:\n{df}")
    return df


pd.DataFrame.__init__ = make_df

df = pd.DataFrame({'a': list('aab'), 'b': [1, 2, 3]})
print()
print(f'df from outer scope:\n{df}')

And the result:

Called Before
df from inner scope:
None

df from outer scope:
   a  b
0  a  1
1  a  2
2  b  3

What's the reason for that?


Solution

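  • In Python, __init__ does not return the object, and it does not create it either. The object has already been created by __new__ by the time __init__ is called.

    The purpose of __init__ is to initialize the object's attributes, so there is nothing for it to return. It is in fact required to return None; returning any other value raises a TypeError. Look at almost any Python source code and you will rarely see a return statement in __init__.

    Inside your monkey patch of __init__ (make_df), df captures the return value of the original pd.DataFrame.__init__, which returns nothing, so df is None. The DataFrame you see in the outer scope is the object that __new__ created. It is passed to __init__ as the first positional argument (self), and the original __init__ populates it in place.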
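The explanation above can be sketched with a small stand-in class. Table, patched_init, and seen are hypothetical names used only for illustration and are not part of pandas. The same wrapping pattern should apply to pd.DataFrame.__init__, with the caveat that pandas may build further DataFrames internally, so a patch there can fire more often than expected.

```python
import functools


class Table:
    """Hypothetical stand-in for pd.DataFrame, used only for illustration."""

    def __init__(self, data):
        # The instance already exists here; __init__ only fills it in.
        self.data = dict(data)

    def __repr__(self):
        return f"Table({self.data})"


original_init = Table.__init__
seen = []  # records what the patch observes on each call


@functools.wraps(original_init)
def patched_init(self, *args, **kwargs):
    # `self` is the object that __new__ created before __init__ was called.
    result = original_init(self, *args, **kwargs)
    # `result` is None, but `self` has now been populated in place.
    seen.append((result, repr(self)))
    # No return statement: __init__ must return None.


Table.__init__ = patched_init

t = Table({"a": [1, 2]})
print(seen)
print(t)
```

To inspect the freshly built object inside a patch, read it from self after calling the original __init__ rather than from the call's return value.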