Suppose I'd like to remove '$' signs from my dataframe in Pandas. I have created a class called TransformFunctions so that I can use getattr() to invoke functions from that class. The reason is that I am using a separate JSON file in which I will list the method names associated with columns in the data to do the processing; because JSON only accepts strings, I decided to invoke methods based on the string, using a suggestion given here.
The code is as below:
import pandas as pd

class TransformFunctions(object):
    def remove_dollar(self, cell_str):
        # Strip the currency symbol and thousands separators, then convert to float
        return float(cell_str.replace("$", "").replace(",", ""))

data = {
    'dpt': [868, 868, 69],
    'name': ['B J SANDIFORD', 'C A WIGFALL', 'A E A-AWOSOGBA'],
    'address': [' DEPARTMENT OF CITYWIDE ADM', 'DEPARTMENT OF CITYWIDE ADM ', ' HRA/DEPARTMENT OF SOCIAL S '],
    'ttl#': ['12702', '12702', '52311'],
    'pc': [' X ', ' X', 'A '],
    'sal-rate': ['$5.00', '$5.00', '$51,955.00']
}
df = pd.DataFrame(data)
klass = TransformFunctions()
df['sal-rate'] = df['sal-rate'].apply(getattr(klass,'remove_dollar')()) ## here, I get TypeError: remove_dollar() missing 1 required positional argument: 'cell_str'
I'd like to know how to use apply from pandas.DataFrame to invoke methods via getattr, if possible. Thank you in advance for your suggestions/answers!
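The reason is that getattr(klass, 'remove_dollar') returns the bound method remove_dollar, and the () at the end of getattr(...) calls that method immediately with no argument, before apply receives anything. apply expects the function itself, so pass the method without calling it (i.e. remove the ()):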
df['sal-rate'] = df['sal-rate'].apply(getattr(klass,'remove_dollar'))
Out[952]:
                        address  dpt            name  pc  sal-rate   ttl#
0    DEPARTMENT OF CITYWIDE ADM  868   B J SANDIFORD   X       5.0  12702
1    DEPARTMENT OF CITYWIDE ADM  868     C A WIGFALL   X       5.0  12702
2    HRA/DEPARTMENT OF SOCIAL S   69  A E A-AWOSOGBA   A   51955.0  52311
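To make the difference between looking a method up and calling it concrete, here is a minimal sketch that leaves pandas out entirely. It reuses the class from the question; the variable names method, converted and error_message are only illustrative.

```python
class TransformFunctions:
    def remove_dollar(self, cell_str):
        return float(cell_str.replace("$", "").replace(",", ""))


klass = TransformFunctions()

# getattr only looks the attribute up; the result is a bound method object.
method = getattr(klass, "remove_dollar")
is_callable = callable(method)

# Calling the bound method with one argument works as expected.
converted = method("$51,955.00")

# Adding () right after getattr(...) calls the method with no cell_str,
# which is the TypeError reported in the question.
try:
    getattr(klass, "remove_dollar")()
    error_message = None
except TypeError as exc:
    error_message = str(exc)
```

apply needs the callable itself (what method holds here), so that it can supply each cell value as cell_str.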
Besides, if you do not need to look the method up by name, why not pass klass.remove_dollar to apply directly, such as:
df['sal-rate'].apply(klass.remove_dollar)
Out[955]:
0        5.0
1        5.0
2    51955.0
Name: sal-rate, dtype: float64
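Since the question mentions driving the processing from a JSON file of method names, here is a rough sketch of how that lookup could work. Everything about the JSON layout is an assumption (a flat object mapping column name to method name), and strip_spaces is a hypothetical second method added only for illustration. Plain lists stand in for the DataFrame columns to keep the sketch dependency-free.

```python
import json


class TransformFunctions:
    def remove_dollar(self, cell_str):
        return float(cell_str.replace("$", "").replace(",", ""))

    def strip_spaces(self, cell_str):
        # Hypothetical extra transform, not part of the original question.
        return cell_str.strip()


# Assumed config format: {"<column name>": "<method name>"}.
config_text = '{"sal-rate": "remove_dollar", "pc": "strip_spaces"}'
column_methods = json.loads(config_text)

# Plain lists standing in for two of the DataFrame columns.
data = {
    "pc": [" X ", " X", "A "],
    "sal-rate": ["$5.00", "$5.00", "$51,955.00"],
}

klass = TransformFunctions()
for column, method_name in column_methods.items():
    # Look the method up by its string name; do not call it here.
    transform = getattr(klass, method_name)
    data[column] = [transform(cell) for cell in data[column]]
```

With the DataFrame from the question, the loop body would instead be df[column] = df[column].apply(transform), which is the same corrected call shown above.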