I have a class that contains a pandas DataFrame (self.my_df), and updating self.my_df does not work how I expect it to. Here's a simplified version of the code that illustrates my problem:
import pandas

class my_obj(object):
    @property
    def my_df(self):
        # Lazily create the DataFrame on first access
        if not hasattr(self, "_my_df"):
            self._my_df = pandas.DataFrame({"A": [1, 2, 3],
                                            "B": [4, 5, 6]}).fillna("")
        print("Retrieving!")
        return self._my_df

    @my_df.setter
    def my_df(self, my_new_df):
        print("Setting!")
        self._my_df = my_new_df.copy()
Here's what happens when I (try to) call these methods (from inside a separate instance method that I don't think matters here):
ipdb> self.my_df
Retrieving!
   A  B
0  1  4
1  2  5
2  3  6
ipdb> self.my_df.loc[2, "B"] = "x"
Retrieving!
ipdb> self.my_df
Retrieving!
   A  B
0  1  4
1  2  5
2  3  x
ipdb> self._my_df
   A  B
0  1  4
1  2  5
2  3  x
I would expect self.my_df.loc[2, "B"] = "x" to call the setter, which it doesn't; or, if it doesn't, I would expect self._my_df not to be changed, which it is.
What's happening here? My real situation is much more complex, but I believe this is the root confusion for me.
Thanks for helping me clear this up.
It's easier to see what's happening if you break down the steps. Instead of
self.my_df.loc[2, "B"] = "x"
consider
temp = self.my_df # Clearly this should call the get method
temp.loc[2, "B"] = "x" # Changes the pandas object
These two snippets achieve the same result. The setter will not be called, since you are not assigning to the my_df property of the my_obj object. You are retrieving the contents of self.my_df (which is a dataframe), and then manipulating it.
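The same thing happens with any mutable object behind a property, not just a DataFrame. Here is a minimal sketch using a plain list; the Holder class, its items property and the calls log are hypothetical names made up for illustration, and the log stands in for the print calls in the question.

```python
class Holder:
    """Hypothetical stand-in for my_obj; a list plays the role of the DataFrame."""

    def __init__(self):
        self._items = [4, 5, 6]
        self.calls = []  # records which accessor ran, instead of printing

    @property
    def items(self):
        self.calls.append("get")
        return self._items  # hands back the stored object itself, not a copy

    @items.setter
    def items(self, new_items):
        self.calls.append("set")
        self._items = list(new_items)


h = Holder()

# Getter runs, then list.__setitem__ mutates the returned list in place.
h.items[2] = "x"
after_mutation = list(h.calls)

# Assigning to the attribute itself is what invokes the setter.
h.items = [7, 8, 9]
after_assignment = list(h.calls)
```

After the in-place change only "get" has been recorded; "set" shows up only once h.items itself is assigned to.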
A my_obj object only holds a reference to a DataFrame, so unless you assign a different object to my_df, the setter will not be called. With your code, the my_obj object still points to the same dataframe, but you have manipulated the dataframe's contents.
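If you want every change to go through the setter, one possible approach (a sketch, not the only design) is to have the getter return a copy, so that in-place edits never reach the stored frame, and then assign the modified copy back. The class name MyObj and the set_calls counter below are made up for illustration, and this assumes copying the DataFrame on every access is affordable for your data.

```python
import pandas


class MyObj:
    """Hypothetical variant of my_obj whose getter returns a copy."""

    def __init__(self):
        self._my_df = pandas.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
        self.set_calls = 0  # counts setter invocations, instead of printing

    @property
    def my_df(self):
        # Callers get a copy, so editing it cannot change the stored frame.
        return self._my_df.copy()

    @my_df.setter
    def my_df(self, my_new_df):
        self.set_calls += 1
        self._my_df = my_new_df.copy()


obj = MyObj()

scratch = obj.my_df          # working copy from the getter
scratch.loc[2, "B"] = 60     # modifies only the copy
stored_after_scratch = int(obj._my_df.loc[2, "B"])

obj.my_df = scratch          # explicit assignment: the setter runs
stored_after_assign = int(obj._my_df.loc[2, "B"])
```

With this variant the stored frame changes only at the explicit assignment, which is also the only point where the setter is called.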