As the title suggests, I want to be able to do the following (best explained with some code). Using pandas 0.20.1 is mandatory.
import pandas as pd
import numpy as np
a = pd.DataFrame(np.random.rand(10, 4), columns=[['a','a','b','b'], ['alfa','beta','alfa','beta',]])
def as_is(x):
    return x
def power_2(x):
    return x**2
# desired result
a.transform([as_is, power_2])
The problem is that the functions could be more complex than this, and then I would lose the "naming" feature, because pandas.DataFrame.transform only lets me pass a list of functions, whereas a dictionary of names to functions would have been most convenient.
Going back to the basics, I got to this:
dict_funct= {'as_is': as_is, 'power_2': power_2}
def wrapper(x):
    return pd.concat({k: x.apply(v) for k,v in dict_funct.items()}, axis=1)
a.groupby(level=[0,1], axis=1).apply(wrapper)
But the output DataFrame is all NaN, presumably due to the ordering of the multi-index columns. Is there any way I can fix this?
If you need the dict, I remove the axis parameter in concat so that it falls back to the default (axis=0). It is then necessary to add the parameter group_keys=False and to call unstack:
def wrapper(x):
    return pd.concat({k: x.apply(v) for k,v in dict_funct.items()})
a.groupby(level=[0,1], axis=1, group_keys=False).apply(wrapper).unstack(0)
Similar solution:
def wrapper(x):
    return pd.concat({k: x.transform(v) for k,v in dict_funct.items()})
a.groupby(level=[0,1], axis=1, group_keys=False).apply(wrapper).unstack(0)
Another solution is simply to use a list comprehension:
a.transform([v for k, v in dict_funct.items()])
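Note that the list comprehension takes the output names from the functions themselves, not from the dict keys. If the keys should be the names (for example when the functions are lambdas), one possible variant is to skip groupby and concatenate over the whole frame. This is only a sketch: the lambdas below are stand-ins for the as_is and power_2 functions from the question, and I am assuming each function can be applied column by column.

```python
import numpy as np
import pandas as pd

# Same shape of data as in the question: two-level column index.
a = pd.DataFrame(
    np.random.rand(10, 4),
    columns=[['a', 'a', 'b', 'b'], ['alfa', 'beta', 'alfa', 'beta']],
)

# Illustrative stand-ins for as_is / power_2; the names come from the keys.
dict_funct = {'as_is': lambda x: x, 'power_2': lambda x: x ** 2}

# Apply each function to the whole frame; the dict keys become
# the outermost column level.
out = pd.concat({k: a.apply(v) for k, v in dict_funct.items()}, axis=1)

# Move the function name to the innermost level and sort the columns
# so that each original column is followed by its transformed versions.
out = out.reorder_levels([1, 2, 0], axis=1).sort_index(axis=1)
```

Here out has three column levels, with the function name as the last one, so a single result can be selected with a tuple such as out[('a', 'alfa', 'power_2')].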