Python serialization of a composite object


I have the following code snippet:

import pickle
class Date:
    def __init__(self, d=1, m=1, y=1):
        self.Day = d
        self.Month = m
        self.Year = y

    def __str__(self):
        return str(self.Day) + "-" + str(self.Month) + "-" + str(self.Year)

class Person:
    def __init__(self, n=0,dob=Date(0,0,0)):
        self.no = n
        self.DOB = dob

    def __str__(self):
        return "No = " + str(self.no) + ", DOB = " + str(self.DOB)
#main
f = open("a.dat", "wb")
d=dict()
p=Person()
p.no = int(raw_input("Enter empl no: "))
p.DOB.Day = int(raw_input("Enter day: "))
p.DOB.Month = int(raw_input("Enter Month: "))
p.DOB.Year = int(raw_input("Enter Year: "))             
d[p.no] = p
p=Person()
p.no = int(raw_input("Enter empl no: "))
p.DOB.Day = int(raw_input("Enter day: "))
p.DOB.Month = int(raw_input("Enter Month: "))
p.DOB.Year = int(raw_input("Enter Year: "))             
d[p.no] = p
pickle.dump(d,f)
f.close()
#now open the file again
f = open("a.dat", "rb")
d = pickle.load(f)
for p in d.values():
    print str(p)

I store two persons in a dictionary, which is pickled and saved to a file. The two persons have different dates of birth, but after loading the dictionary back from the file, both show the same DOB. The input and output are as follows:

Enter empl no: 1
Enter day: 1
Enter Month: 1
Enter Year: 2001
Enter empl no: 2
Enter day: 2
Enter Month: 2
Enter Year: 2002
No = 1, DOB = 2-2-2002
No = 2, DOB = 2-2-2002

What is wrong here? Why do both objects show the same date although I entered different dates for them? Please advise. There is some discussion of this in “Least Astonishment” in Python: The Mutable Default Argument, but what should I do if I want each Person object to keep the date entered for it?


Solution

  • The problem is the default object argument in the initialiser:

    def __init__(self, n=0,dob=Date(0,0,0)):
    

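    As that discussion explains, the Date constructor is not called on every call to your method. Instead, the default value is evaluated once, when the def statement is executed (that is, when the module is first loaded), and the same Date instance is used for every call that omits dob. So both of your Person objects share a single Date: setting the second person's DOB overwrites the first one's. Pickle just saves that shared object faithfully; it is not the cause.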

    Edit: The common idiom in situations like this, if you still want to keep the default argument behaviour, is to use None as the default and check for it in the body. In your case, that means:

    def __init__(self, n=0,dob=None):
    
       # Calling `Date` in the initialiser's body guarantees a new instance for every call
       if dob is None:
          dob = Date(0,0,0)
    
       # At this point dob is either a new instance or the one that was passed in
       self.no = n
       self.DOB = dob
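
To see both behaviours side by side, here is a minimal sketch. It assumes Python 3 (the question uses Python 2's raw_input and print statement), replaces the interactive input with the values from the question, and pickles in memory with pickle.dumps/pickle.loads instead of writing a.dat. The names SharedPerson and make_people are hypothetical, introduced only for this illustration.

```python
import pickle


class Date:
    def __init__(self, d=1, m=1, y=1):
        self.Day = d
        self.Month = m
        self.Year = y

    def __str__(self):
        return "{}-{}-{}".format(self.Day, self.Month, self.Year)


class SharedPerson:
    # Buggy form from the question: Date(0, 0, 0) is built once,
    # when this def statement runs, and reused as the default afterwards.
    def __init__(self, n=0, dob=Date(0, 0, 0)):
        self.no = n
        self.DOB = dob


class Person:
    # Fixed form: None is a sentinel, so a fresh Date is built per call.
    def __init__(self, n=0, dob=None):
        if dob is None:
            dob = Date(0, 0, 0)
        self.no = n
        self.DOB = dob

    def __str__(self):
        return "No = {}, DOB = {}".format(self.no, self.DOB)


def make_people(cls):
    # Hypothetical helper: stands in for the interactive input in the question.
    people = {}
    for no, (day, month, year) in [(1, (1, 1, 2001)), (2, (2, 2, 2002))]:
        p = cls()
        p.no = no
        p.DOB.Day, p.DOB.Month, p.DOB.Year = day, month, year
        people[p.no] = p
    return people


# With the buggy default, both persons hold the very same Date object.
shared = make_people(SharedPerson)
same_date_object = shared[1].DOB is shared[2].DOB  # True

# With the None sentinel, each person keeps its own date across a pickle round trip.
restored = pickle.loads(pickle.dumps(make_people(Person)))
lines = [str(restored[no]) for no in sorted(restored)]
for line in lines:
    print(line)
# No = 1, DOB = 1-1-2001
# No = 2, DOB = 2-2-2002
```

The identity check (is) is what exposes the shared default: the two SharedPerson objects are distinct, but their DOB attributes point at one Date, so the last assignment wins before anything is written to disk.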