In Python v2.7, if I have a function like this:
def holes_between(intervals):
# Compute the holes between the intervals, for example:
# given the intervals: ([ 8, 9] [14, 18] [19, 20] [23, 32] [34, 49])
# compute the holes: ([10, 13] [21, 22] [33, 33])
prec = intervals[0][1] + 1 # Bootstrap the iteration
for low, high in intervals[1:]:
if prec <= low - 1:
yield (prec, low - 1)
prec = high + 1
holes = list(holes_between(intervals))
Since the yield
of the function is being collected into a list
, is it more efficient to just build the list inside the holes_between
function and if so, how would that be done most efficiently?
A generator function can be less efficient than building a list directly.
You can just build the list in the holes_between()
function and return that:
def holes_between(intervals):
prec = intervals[0][1] + 1 # Bootstrap the iteration
result = []
for low, high in intervals[1:]:
if prec <= low - 1:
result.append((prec, low - 1))
prec = high + 1
return result
but do measure the differences, using the timeit
module.
If you have some typical input, you'd test that with:
import timeit
def holes_between_list(intervals):
prec = intervals[0][1] + 1 # Bootstrap the iteration
result = []
for low, high in intervals[1:]:
if prec <= low - 1:
result.append((prec, low - 1))
prec = high + 1
return result
def holes_between_generate(intervals):
prec = intervals[0][1] + 1 # Bootstrap the iteration
for low, high in intervals[1:]:
if prec <= low - 1:
yield (prec, low - 1)
prec = high + 1
intervals = [ ... ] # fill in some test data
print 'As list:', timeit.timeit(
'holes_between(intervals)',
'from __main__ import intervals, holes_between_list as holes_between')
print 'Using a generator:', timeit.timeit(
'list(holes_between(intervals))',
'from __main__ import intervals, holes_between_generate as holes_between')
The lower value is the faster method for your test data.