I want to record the last access time and mark the object as modified, so I wrote a class that overrides its __setattr__ method:
    import time

    class CacheObject(object):
        __slots__ = ('modified', 'lastAccess')

        def __init__(self):
            object.__setattr__(self, 'modified', False)
            object.__setattr__(self, 'lastAccess', time.time())

        def setModified(self):
            object.__setattr__(self, 'modified', True)
            object.__setattr__(self, 'lastAccess', time.time())

        def resetTime(self):
            object.__setattr__(self, 'lastAccess', time.time())

        def __setattr__(self, name, value):
            if (not hasattr(self, name)) or object.__getattribute__(self, name) != value:
                object.__setattr__(self, name, value)
                self.setModified()
    class example(CacheObject):
        __slots__ = ('abc',)

        def __init__(self, i):
            self.abc = i
            super(example, self).__init__()
    t = time.time()
    f = example(0)
    for i in range(100000):
        f.abc = i
    print(time.time() - t)
I measured the process time, and it took 2 seconds. When I commented out the overridden function, it took 0.1 seconds. I knew the override would be slower, but a gap of almost 20x is too much; I must be getting something wrong.
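The gap can be reproduced with a minimal pair (a sketch; the class and helper names here are made up for illustration): even a do-nothing __setattr__ override replaces the C-level slot write with a Python-level method call on every assignment.

```python
import time

class Plain(object):
    __slots__ = ('abc',)

class Overridden(object):
    __slots__ = ('abc',)
    def __setattr__(self, name, value):
        # Do-nothing override: just forward to the default machinery.
        object.__setattr__(self, name, value)

def bench(obj, n=100000):
    """Time n attribute assignments on obj (illustrative helper)."""
    start = time.time()
    for i in range(n):
        obj.abc = i
    return time.time() - start

plain_t = bench(Plain())
over_t = bench(Overridden())
# Even with no bookkeeping at all, every assignment on Overridden goes
# through a Python-level call, so over_t is typically several times plain_t.
```

So part of the 20x is unavoidable dispatch overhead; the rest comes from the work done inside the override.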
Taking the suggestion from cfi:
Eliminate the if condition
    def __setattr__(self, name, value):
        # if (not hasattr(self, name)) or object.__getattribute__(self, name) != value:
        object.__setattr__(self, name, value)
        self.setModified()
This takes the running time down to 1.9s, a small improvement. But marking the object as modified when it hasn't actually changed would cost more elsewhere, so it's not an option.
Change self.func to classname.func(self)
    def __setattr__(self, name, value):
        if (not hasattr(self, name)) or object.__getattribute__(self, name) != value:
            object.__setattr__(self, name, value)
            CacheObject.setModified(self)
The runtime is 2.0s, so nothing really changed.
Inline the setModified function
    def __setattr__(self, name, value):
        if (not hasattr(self, name)) or object.__getattribute__(self, name) != value:
            object.__setattr__(self, name, value)
            object.__setattr__(self, 'modified', True)
            object.__setattr__(self, 'lastAccess', time.time())
This pushes the runtime down to 1.2s! That's great, saving almost 50% of the time, though the cost is still high.
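Two further micro-optimizations worth trying, neither from the thread above, so treat this as an experimental sketch: bind object.__setattr__, object.__getattribute__, and time.time as default arguments (each call then does fast local lookups instead of global and attribute lookups), and replace hasattr with a try/except around the comparison, saving one lookup per call.

```python
import time

class CacheObject(object):
    __slots__ = ('modified', 'lastAccess')

    def __init__(self):
        object.__setattr__(self, 'modified', False)
        object.__setattr__(self, 'lastAccess', time.time())

    # Default arguments are evaluated once at definition time, so _set,
    # _get, and _now are local names inside every call.
    def __setattr__(self, name, value,
                    _set=object.__setattr__,
                    _get=object.__getattribute__,
                    _now=time.time):
        try:
            if _get(self, name) == value:
                return  # unchanged: skip the bookkeeping
        except AttributeError:
            pass  # attribute not set yet, same as the hasattr() branch
        _set(self, name, value)
        _set(self, 'modified', True)
        _set(self, 'lastAccess', _now())

class Example(CacheObject):
    __slots__ = ('abc',)

    def __init__(self, i):
        self.abc = i
        super(Example, self).__init__()
```

The semantics match the original: an assignment of an unchanged value leaves the modified flag alone, and only real changes update it and the timestamp.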
Not a complete answer but some suggestions:
Can you eliminate the value comparison? That is, of course, a feature change to your implementation. But the runtime overhead will be even worse if more complex objects than integers are stored in the attributes.
Every call to a method via self needs to go through full method resolution order (MRO) checking. I don't know whether Python can do any MRO caching itself; probably not, because of the types-being-dynamic principle. Thus, you should be able to reduce some overhead by changing any self.method(args) to classname.method(self, args). That removes the MRO overhead from the calls. This applies to self.setModified() in your __setattr__() implementation. In most places you have done this already with references to object.
Every single function call takes time. You could eliminate some of them, e.g. by moving setModified's functionality into __setattr__ itself.
Let us know how the timing changes for each of these. I'd split out the experiment.
Edit: Thanks for the timing numbers.
The overhead may seem drastic (still a factor of 10, it seems). However, put that into the perspective of overall runtime. In other words: how much of your overall runtime is spent setting those tracked attributes, and how much is spent elsewhere?
In a single-threaded application, Amdahl's Law is a simple rule for setting expectations straight. An illustration: if 1/3 of the time is spent setting attributes and 2/3 doing other work, then slowing attribute setting down by 10x only slows down that third. The smaller the percentage of time spent on the attributes, the less we have to care. But this may not help you at all if your percentage is high...
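That illustration is one line of arithmetic (a sketch; overall_slowdown is a helper name made up here, with p the fraction of runtime affected and s the slowdown factor):

```python
def overall_slowdown(p, s):
    """Overall runtime factor when a fraction p of the work
    becomes s times slower (serial Amdahl-style estimate)."""
    return (1 - p) + p * s

# 1/3 of runtime in attribute setting, slowed 10x:
# (2/3) * 1 + (1/3) * 10 = 4, i.e. the whole program runs ~4x slower.
third = overall_slowdown(1/3, 10)

# Only 5% of runtime in attribute setting: 0.95 + 0.5 = 1.45x overall.
small = overall_slowdown(0.05, 10)
```

So even a 10x hit on the tracked assignments only matters in proportion to how hot that path is.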