
Streamlit doesn't show a progress bar inside a cache_data function


With Streamlit, I decorate a heavy function with st.cache_data to avoid recalculation. Since the function is time-consuming, I also want to show a progress bar from inside it. However, I find that the progress bar does not work when cache_data is specified.

The following is an MWE:

import streamlit as st
from time import sleep

@st.cache_data(show_spinner = False)
def showProgressBar():
    cur = 0
    total = 100
    my_bar = st.progress(cur / total, text = "%d / %d" % (cur, total))
    while cur < total:
        sleep(0.05)
        cur = cur + 1
        my_bar.progress(cur / total, text = "%d / %d" % (cur, total))
    my_bar.empty()

### Main ###
st.set_page_config(page_title="Test Progress In Cache Function", page_icon=":bar_chart:",layout="wide")
st.title(" :bar_chart: Test")
showProgressBar()
st.text('Test Finish')

It turns out the bar never shows up with this code. But if I comment out the @st.cache_data line, the progress bar works as expected.

A similar problem with progress bars and st.cache is mentioned in this thread, where the workaround seems to involve suppress_st_warning = True. However, with the deprecation of st.cache, this parameter no longer seems to be available for st.cache_data.

Can anyone provide some help here?


Solution

  • If your heavy calculation is the sleep(0.05) step, here is a way to do it. This code moves the heavy calculation out of the progress bar handling, so that the cache_data decorator is applied only to code that makes no Streamlit calls:

    import streamlit as st
    from time import sleep
    from random import randint
    
    @st.cache_data(show_spinner = False)
    def do_heavy_calc(n):
        print(f"First time seeing {n}")
        sleep(0.05)
    
    
    def showProgressBar():
        cur = 0
        total = 100
        my_bar = st.progress(cur / total, text = "%d / %d" % (cur, total))
        while cur < total:
            n = randint(1, 100)
            # Pass a random number, since I suppose you don't want to run
            # the exact same calculation every time(?)
            do_heavy_calc(n)
            cur = cur + 1
            my_bar.progress(cur / total, text = "%d / %d" % (cur, total))
        my_bar.empty()
    
    ### Main ###
    st.set_page_config(page_title="Test Progress In Cache Function", page_icon=":bar_chart:",layout="wide")
    st.title(" :bar_chart: Test")
    showProgressBar()
    st.text('Test Finish')
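
The separation used above can also be seen without Streamlit. The following is only a minimal sketch under stated assumptions: functools.lru_cache stands in for st.cache_data, and run_with_progress and the report callback are hypothetical names, not part of Streamlit. It shows that the progress reporting runs on every pass while the cached work runs only once per distinct input.

```python
from functools import lru_cache

calls = []  # inputs that were actually computed (cache misses)

@lru_cache(maxsize=None)  # assumption: stand-in for @st.cache_data
def do_heavy_calc(n):
    calls.append(n)
    return n * n  # placeholder for the real heavy work

def run_with_progress(items, report):
    """Call the cached function per item and report progress outside of it."""
    total = len(items)
    results = []
    for done, n in enumerate(items, start=1):
        results.append(do_heavy_calc(n))
        # In a Streamlit app this is where my_bar.progress(done / total) would go.
        report(done, total)
    return results

progress_log = []
first = run_with_progress([1, 2, 3], lambda d, t: progress_log.append((d, t)))
second = run_with_progress([1, 2, 3], lambda d, t: progress_log.append((d, t)))
```

On the second pass every call is a cache hit, so the loop finishes almost immediately, but the progress callback is still invoked for each item.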
    

    If, however, you want to track the progress of one long function that you call only once, you can convert it to a generator that yields the step that do_heavy_calc is currently at. I don't recommend this (it feels a bit hacky), but it seems to work fine in my tests.

    Two things to note:

    • cache_data is replaced by cache_resource, since cache_data raises streamlit.runtime.caching.cache_errors.UnserializableReturnValueError (the returned generator cannot be pickled).
    • I put an underscore (_) in front of the two input arguments so that they are not used for caching purposes. See the section "Excluding input parameters" in this Streamlit documentation page.

    import streamlit as st
    from time import sleep
    
    @st.cache_resource(show_spinner = False)
    def do_heavy_calc(_cur, _total):
        while _cur < _total:
            # yield where you are so that the progress bar on the outside can
            # keep track
            yield _cur
            sleep(0.05)
            _cur = _cur + 1
    
    def showProgressBar():
        min_ = 0
        max_ = 100
        my_bar = st.progress(min_ / max_, text = "%d / %d" % (min_, max_))
        for c in do_heavy_calc(0, max_):
            my_bar.progress(c / max_, text = "%d / %d" % (c, max_))
        my_bar.empty()
    
    ### Main ###
    st.set_page_config(page_title="Test Progress In Cache Function", page_icon=":bar_chart:",layout="wide")
    st.title(" :bar_chart: Test")
    showProgressBar()
    st.text('Test Finish')
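
One thing to keep in mind with the generator version: what gets cached is the generator object itself, not a computed result. The sketch below mimics this in plain Python. cache_ignoring_args is a hypothetical stand-in for st.cache_resource with every parameter underscore-prefixed (one stored object per function, arguments ignored); it is an assumption for illustration, not Streamlit's implementation.

```python
import functools

def cache_ignoring_args(func):
    """Hypothetical stand-in: store the first return value and hand the
    same object back on every later call, whatever the arguments."""
    store = {}

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if "value" not in store:
            store["value"] = func(*args, **kwargs)
        return store["value"]

    return wrapper

@cache_ignoring_args
def do_heavy_calc(_cur, _total):
    while _cur < _total:
        yield _cur  # report the current step to the caller
        _cur += 1

first_run = list(do_heavy_calc(0, 5))   # consumes the generator
second_run = list(do_heavy_calc(0, 5))  # same, already exhausted generator
```

Here the second call receives the exhausted generator, so the loop body never runs again. That is what skips the work on a rerun, but it also means no result is stored; if the long function needs to return a value, it has to be kept somewhere else.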