I have a generator that generates a finite sequence. To determine the length of this sequence I tried these two approaches:
seq_len = sum([1 for _ in euler14_seq(sv)]) # list comp
and
seq_len = sum(1 for _ in euler14_seq(sv)) # generator expression
sv is a constant starting value for the sequence.
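For readers who want to try the two statements, here is a minimal sketch. The body of euler14_seq is not shown in the question, so the version below is a hypothetical stand-in that yields a Collatz sequence (the subject of Project Euler #14); the value sv = 13 is likewise just an illustrative choice.

```python
def euler14_seq(start):
    """Hypothetical stand-in: yield the Collatz sequence from `start` down to 1."""
    n = start
    while n != 1:
        yield n
        # Halve even numbers, map odd numbers to 3n + 1.
        n = n // 2 if n % 2 == 0 else 3 * n + 1
    yield 1


sv = 13  # illustrative starting value, not taken from the question

# Approach 1: build a full list of 1s, then sum it.
len_listcomp = sum([1 for _ in euler14_seq(sv)])

# Approach 2: feed sum() one element at a time from a generator expression.
len_genexp = sum(1 for _ in euler14_seq(sv))

print(len_listcomp, len_genexp)
```

Both approaches count the same elements; they differ only in whether an intermediate list is built before summing.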
I had expected the list comprehension to be slower and the generator expression to be faster, but it turned out to be the other way around.
I assume the first one will be much more memory intensive since it creates a complete list in memory first - part of the reason I also thought it would be slower.
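The memory assumption can be checked roughly with sys.getsizeof. This is only a sketch: getsizeof reports the size of the container object itself, not of the elements it references, and the size n below is an arbitrary choice. It uses range so that it runs on Python 3; on Python 2, xrange would be the lazy equivalent.

```python
import sys

n = 100000  # arbitrary illustrative size

as_list = [1 for _ in range(n)]  # materializes all n elements up front
as_gen = (1 for _ in range(n))   # lazy: produces elements one at a time

list_size = sys.getsizeof(as_list)
gen_size = sys.getsizeof(as_gen)

print("list object:      %d bytes" % list_size)
print("generator object: %d bytes" % gen_size)
```

The generator object stays the same small size regardless of n, while the list grows with the number of elements.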
My question: Is this observation generalizable? And is this due to having two generators involved in the second statement vs the first?
I've looked at "What's the shortest way to count the number of items in a generator/iterator?", "Length of generator output", and "Is there any built-in way to get the length of an iterable in python?" and saw some other approaches to measuring the length of a sequence, but I'm specifically curious about the comparison of list comp vs generator expression.
PS: This came up when I decided to solve Project Euler #14 based on a question asked on SO yesterday.
(By the way, what's the general feeling regarding use of the '_' in places where variable values are not needed?)
This was done with Python 2.7.2 (32-bit) under Windows 7 64-bit.
On this computer, the generator expression becomes faster somewhere between 100,000 and 1,000,000 elements:
$ python -m timeit "sum(1 for x in xrange(100000))"
10 loops, best of 3: 34.8 msec per loop
$ python -m timeit "sum([1 for x in xrange(100000)])"
10 loops, best of 3: 20.8 msec per loop
$ python -m timeit "sum(1 for x in xrange(1000000))"
10 loops, best of 3: 315 msec per loop
$ python -m timeit "sum([1 for x in xrange(1000000)])"
10 loops, best of 3: 469 msec per loop
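To look for the crossover point on another machine, the same measurements can be scripted with the timeit module instead of the command line. This is a rough sketch: the helper name time_both, the sizes, and the repeat counts are all arbitrary choices, and range is used so the script runs on Python 3 (on Python 2, xrange matches the commands above). Results will vary with the interpreter version and hardware.

```python
import timeit


def time_both(n, repeat=3, number=5):
    """Return (genexp_seconds, listcomp_seconds) per call, best of `repeat` runs."""
    setup = "n = %d" % n
    gen_best = min(timeit.repeat("sum(1 for x in range(n))",
                                 setup=setup, repeat=repeat, number=number))
    list_best = min(timeit.repeat("sum([1 for x in range(n)])",
                                  setup=setup, repeat=repeat, number=number))
    # timeit reports total time for `number` executions; convert to per call.
    return gen_best / number, list_best / number


for size in (1000, 10000, 100000):
    gen_t, list_t = time_both(size)
    print("n=%-7d genexp: %.3f msec   listcomp: %.3f msec"
          % (size, gen_t * 1000, list_t * 1000))
```

Extending the tuple of sizes toward 1,000,000 shows whether the ordering flips on a given setup, as it did in the timings above.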