Tags: python, python-itertools, tqdm

tqdm progress bar for chained iterables


If I want to combine two iterators in Python, one approach is to use itertools.chain.

For example, if I have two ranges range(50, 100, 10) and range(95, 101), then itertools.chain(range(50, 100, 10), range(95, 101)) gives me an iterator that yields 50, 60, 70, 80, 90, 95, 96, 97, 98, 99, 100.

tqdm is an extensible progress bar for Python. However, by default it doesn't seem to be able to count the number of items in an itertools.chain expression, even when the chained inputs have a fixed length.

One solution is to convert the chained iterator to a list. However, this approach doesn't scale, because the whole sequence has to be held in memory.

Is there a way to ensure tqdm understands chained iterators?

from tqdm import tqdm
import itertools
import time

# shows progress bar
for i in tqdm(range(50, 100, 10)):
    time.sleep(1)
   
# does not know number of items, does not show full progress bar
for i in tqdm(itertools.chain(range(50, 100, 10), range(95, 101))):
    time.sleep(1)
    
# works, but doesn't scale
my_range = [*itertools.chain(range(50, 100, 10), range(95, 101))]    
for i in tqdm(my_range):
    time.sleep(1)
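
The difference between the first two loops can be reproduced without tqdm. The following is a minimal sketch, using only the ranges from the question: a range object reports its length, while the object returned by itertools.chain is a plain iterator with no length, so there is nothing for a progress bar to read a total from.

```python
import itertools

first = range(50, 100, 10)
second = range(95, 101)
chained = itertools.chain(first, second)

# range objects are sized, so len() works on each of them
print(len(first), len(second))  # 5 6

# chain objects are plain iterators and do not define __len__
try:
    len(chained)
except TypeError as exc:
    print(f"len() failed: {exc}")
```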

Solution

  • This is more of a workaround than an answer, because it looks like tqdm can't handle this right now. You can find the length of each of the iterables you chained together, sum those lengths, and pass the result as the total= argument when you call tqdm.

    from tqdm import tqdm
    from itertools import chain
    
    # I started with a list of iterables
    iterables = [iterable1, iterable2, ...]
    
    # Get the length of each iterable and then sum them all
    # (every iterable must support len(), e.g. a range or a list)
    total_len = sum(len(i) for i in iterables)
    
    # Then chain them together
    main_iterable = chain(*iterables)
    
    # And finally, run it through tqdm
    # Make sure to use the `total=` argument
    for thing in tqdm(main_iterable, total=total_len):
        pass  # Do stuff
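
As a concrete instance of the workaround, here is a sketch that applies it to the two ranges from the question. It also shows one possible variation: a small wrapper class, SizedChain, which is a hypothetical helper and not part of itertools or tqdm. It relies on tqdm falling back to len() of the iterable when total= is not given, and it only works when every chained input supports len().

```python
from itertools import chain

from tqdm import tqdm

# The two ranges from the question
ranges = [range(50, 100, 10), range(95, 101)]

# Approach 1: sum the lengths and pass the result as total=
total_len = sum(len(r) for r in ranges)
for i in tqdm(chain(*ranges), total=total_len):
    pass  # Do stuff


# Approach 2 (hypothetical helper): wrap the chain in a sized iterable
class SizedChain:
    """Chain several sized iterables and expose their combined length."""

    def __init__(self, *iterables):
        self.iterables = iterables

    def __len__(self):
        # Requires len() on every input (ranges, lists, tuples, ...)
        return sum(len(it) for it in self.iterables)

    def __iter__(self):
        # A fresh chain on every call, so the object can be iterated again
        return chain(*self.iterables)


items = SizedChain(*ranges)

# No total= here: tqdm reads the length from len(items)
seen = [i for i in tqdm(items)]
```

Neither approach materialises the chained sequence, so memory use stays constant regardless of how long the ranges are. Inputs without a length, such as generators, would still need a total supplied by hand.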