
Optimization of importing modules in Python


I am reading David Beazley's Python Reference book and he makes a point:

For example, if you were performing a lot of square root operations, it is faster to use 'from math import sqrt' and 'sqrt(x)' rather than typing 'math.sqrt(x)'.

and:

For calculations involving heavy use of methods or module lookups, it is almost always better to eliminate the attribute lookup by putting the operation you want to perform into a local variable first.

I decided to try it out:

first()

def first():
    from collections import defaultdict
    x = defaultdict(list)

second()

def second():
    import collections
    x = collections.defaultdict(list)

The results were:

2.15461492538
1.39850616455

Optimizations such as these probably don't matter to me, but I am curious why the opposite of what Beazley has written comes out to be true. Note that the difference is about 0.75 seconds, which is significant given that the task is trivial.

Why is this happening?

UPDATE:

I am measuring the timings like this:

from timeit import timeit

print(timeit('first()', 'from __main__ import first'))
print(timeit('second()', 'from __main__ import second'))

Solution

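  • The 'from collections import defaultdict' and 'import collections' statements should be outside the timed functions, since in real code you would run them once rather than on every call. As written, the test mostly measures the cost of re-executing the import statement, not the cost of the name lookup that Beazley is talking about.

    I guess that the 'from' syntax has to do more work than the 'import' syntax: besides locating the already-loaded module, it also has to fetch the requested name from it.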

    Using this test code:

    #!/usr/bin/env python
    
    import timeit
    
    from collections import defaultdict
    import collections
    
    def first():
        from collections import defaultdict
        x = defaultdict(list)
    
    def firstwithout():
        x = defaultdict(list)
    
    def second():
        import collections
        x = collections.defaultdict(list)
    
    def secondwithout():
        x = collections.defaultdict(list)
    
    # Each call below runs the statement 1,000,000 times (the timeit default).
    print("first with import", timeit.timeit('first()', 'from __main__ import first'))
    print("second with import", timeit.timeit('second()', 'from __main__ import second'))
    
    print("first without import", timeit.timeit('firstwithout()', 'from __main__ import firstwithout'))
    print("second without import", timeit.timeit('secondwithout()', 'from __main__ import secondwithout'))
    

    I get results:

    first with import 1.61359190941
    second with import 1.02904295921
    first without import 0.344709157944
    second without import 0.449721097946
    

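    This shows how much the repeated imports cost: they account for most of the time in the first two measurements. Once the imports are moved out of the timed functions, the 'from' version is the faster one (about 0.34 s versus 0.45 s), which agrees with Beazley's advice.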
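One way to check the guess that the 'from' form does more work is to look at the bytecode each statement compiles to. The following is only a rough sketch, assuming CPython 3 (opcode names are an implementation detail and differ between versions); the helper name opnames is made up for this illustration.

```python
import dis


def opnames(source):
    """Return the bytecode operation names for a snippet of source code.

    Hypothetical helper for illustration; not part of any library.
    """
    code = compile(source, "<snippet>", "exec")
    return [ins.opname for ins in dis.get_instructions(code)]


from_ops = opnames("from collections import defaultdict")
plain_ops = opnames("import collections")

# Both forms execute IMPORT_NAME; on CPython only the 'from' form
# additionally executes IMPORT_FROM to pull the name out of the module.
print("from-import :", from_ops)
print("plain import:", plain_ops)
```

On CPython the 'from' statement should show an IMPORT_FROM step that the plain import lacks. That extra opcode is probably not the whole story, since the import machinery may also do additional processing of the from-list, but it is consistent with the timings above: the extra cost is paid on every call when the import sits inside the function, and only once when it is at module level.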