python pandas plotly streamlit

What is the proper syntax to use for @st.cache_data for my webpage?


I have created a webpage that extracts data from a database and shows it as a line chart using Streamlit. I recently found out that I can cache the repetitive data using @st.cache_data, but I don't understand what the proper syntax should be in my specific code so that it doesn't throw any errors. Below is how I have tried to implement it. It throws Plotly-related errors that only started after I introduced the st.cache operation, so I think it has something to do with the way my code handles datafr_creator(). What is the proper way to cache a dataframe? I couldn't find other implementations where a for loop is used.

#initialize dataframe
df = pd.DataFrame()

@st.cache_data(ttl=120)
def datafr_creator():
    global df
    for row in supabaseList:
        row["created_at"] = row["created_at"].split(".")[0]
        row["time"] = row["created_at"].split("T")[1]
        row["date"] = row["created_at"].split("T")[0]
        row["DateTime"] = row["created_at"]
        df = df.append(row, ignore_index=True)
    return (df)

datafr_creator()
#creating local list

#Display section
orignal_title = '<h1 style="font-family:Helvetica; color:Black; font-size: 45px; text-align: center">Tridev Water Monitoring System</p>'
st.markdown(orignal_title, unsafe_allow_html=True)
st.text("")
fig = px.area(df, x="DateTime", y="water_level", title='',markers=False)
fig.update_layout(
    title={
        'text': "Water level in %",
        'y':0.9,
        'x':0.5,
        'xanchor': 'center',
        'yanchor': 'top'})
fig.update_layout(yaxis_range = [0,120])
fig.update_layout(xaxis_range = custom_range)

#Add Horizontal line in plotly chart for pump trigger level
fig.add_hline(y=80, line_width=3, line_color="black",
              annotation_text="Pump Start Level",
              annotation_position="top left",
              annotation_font_size=15,
              annotation_font_color="black"
              )

#Final Chart print
st.plotly_chart(fig,use_container_width=True)

It is currently throwing this error:

ValueError: Value of 'x' is not the name of a column in 'data_frame'. Expected one of [] but received: DateTime

I want to cache the dataframe so that the code doesn't have to fetch data from the database and then process it inside the for loop on every run, all of which is time-consuming.


Solution

  • The problem comes from the fact that df does not contain any column named "DateTime" here: fig = px.area(df, x="DateTime", y="water_level", title='',markers=False).

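    The "Expected one of []" part of the message shows that the dataframe is empty at that point. The likely cause is that Streamlit reruns the whole script on every interaction: df = pd.DataFrame() resets the global to an empty frame, and on a cache hit st.cache_data returns the stored result without executing the function body, so the global df is never filled again.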

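    Moreover, you shouldn't use global with the cache. It is not necessary, and any side effect inside a cached function is skipped whenever the cache is hit. Build the dataframe inside datafr_creator, decorate the function with the cache decorator, and assign its return value to a variable, as shown in the Streamlit docs.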

    It gives:

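    import pandas as pd
    import plotly.express as px
    import streamlit as st
    
    @st.cache_data(ttl=120)
    def datafr_creator():
        # supabaseList (and custom_range below) come from the rest of your script.
        rows = []
        for row in supabaseList:
            row = dict(row)  # copy so the source list is not modified
            row["created_at"] = row["created_at"].split(".")[0]
            row["time"] = row["created_at"].split("T")[1]
            row["date"] = row["created_at"].split("T")[0]
            row["DateTime"] = row["created_at"]
            rows.append(row)
        # DataFrame.append was removed in pandas 2.0, so build the frame in one step.
        return pd.DataFrame(rows)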
    
    # Here, it will call datafr_creator and use the cache if the
    # dataframe has already been generated.
    df = datafr_creator()
    
    orignal_title = '<h1 style="font-family:Helvetica; color:Black; font-size: 45px; text-align: center">Tridev Water Monitoring System</h1>'
    st.markdown(orignal_title, unsafe_allow_html=True)
    st.text("")
    fig = px.area(df, x="DateTime", y="water_level", title='', markers=False)
    fig.update_layout(
        title={
            'text': "Water level in %",
            'y':0.9,
            'x':0.5,
            'xanchor': 'center',
            'yanchor': 'top'
        }
    )
    fig.update_layout(yaxis_range=[0, 120])
    fig.update_layout(xaxis_range=custom_range)
    
    # Add Horizontal line in plotly chart for pump trigger level
    fig.add_hline(
        y=80, line_width=3, line_color="black",
        annotation_text="Pump Start Level",
        annotation_position="top left",
        annotation_font_size=15,
        annotation_font_color="black"
    )
    
    # Final Chart print
    st.plotly_chart(fig,use_container_width=True)
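
To see why the global-based version ends up with an empty dataframe, here is a minimal sketch that runs without Streamlit. It is an illustration under assumptions, not Streamlit's implementation: functools.lru_cache stands in for @st.cache_data, a plain list stands in for the dataframe, and simulate_rerun, load_with_side_effect and load_and_return are hypothetical names. One difference worth noting is that st.cache_data hands back a copy of the cached value, while lru_cache returns the same object.

```python
from functools import lru_cache

rows = []  # module-level state, playing the role of the global df


@lru_cache(maxsize=None)  # stand-in for @st.cache_data, not Streamlit itself
def load_with_side_effect():
    rows.append("fetched")  # side effect: only happens on a cache miss
    return len(rows)


@lru_cache(maxsize=None)
def load_and_return():
    return ["fetched"]  # all data travels through the return value


def simulate_rerun():
    """Mimic one top-to-bottom script run and return both views of the data."""
    global rows
    rows = []  # the script resets its global at the top of every run
    load_with_side_effect()  # after the first run this is a cache hit
    return list(rows), list(load_and_return())


first_run = simulate_rerun()
second_run = simulate_rerun()
print(first_run)
print(second_run)
```

On the first run both approaches see the data. On the second run the global is empty again because the cached function body never executes, while the returned value is still available. That is the same pattern as calling df = datafr_creator() instead of relying on global df.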