Below is a minimal reproducible example that illustrates the problem in its simplest form. The description is long, so feel free to jump straight to the code and the expected behaviour.
There are three dataframes stored in a list, and a form in the sidebar shows the supplier_name and po_number from the current dataframe. When the user clicks the Next button, the values in the supplier_name and po_number text_input fields are saved (in this example they are simply printed at the top of the sidebar).
The app works well as long as the user doesn't change anything inside the text_input, but it breaks as soon as they do. See the picture below for an example: when I change the po_number to somethingrandom, the saved value is not somethingrandom but P123 from the first dataframe.
What's more, if a value in the next dataframe is the same as in the current one, an edited text_input keeps its edited value on the next page. For example, the first and second dataframes both have the supplier name S1, so if I change the supplier name to S10 and click Next, supplier_name still shows S10 for the second dataframe, when it should show S1. If the supplier name in the next dataframe is different, however, the text_input is updated as expected.
If you are wondering why I want this: the original use case is a sidebar input area that shows information extracted from a series of PDFs. When the user confirms that the information is correct, they click Next to review the next PDF. If something is wrong, they can correct it in the text_input and then click Next; the corrected values should be recorded, and the form should then show the information extracted from the next PDF. I did this quite simply in R Shiny, but I can't figure out how the data flow works in Streamlit. Please help.
import streamlit as st
import pandas as pd

# 3 dataframes that are stored in a list
data1 = {
    "supplier_name": ["S1"],
    "po_number": ["P123"],
}
data2 = {
    "supplier_name": ["S1"],
    "po_number": ["P124"],
}
data3 = {
    "supplier_name": ["S2"],
    "po_number": ["P125"],
}
df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)
df3 = pd.DataFrame(data3)
list1 = [df1, df2, df3]

# initiate a page session state, every time next button is clicked
# it will go to the next dataframe in the list
if 'page' not in st.session_state:
    st.session_state.page = 0

def next_page():
    st.sidebar.write(f"Submitted! supplier_name: {supplier_name} po_number: {po_number}")
    st.session_state.page += 1

supplier_name_value = list1[st.session_state.page]["supplier_name"][0]
po_number_value = list1[st.session_state.page]["po_number"][0]

# main area
list1[st.session_state.page]

# sidebar form
with st.sidebar.form("form"):
    supplier_name = st.text_input(label="Supplier Name", value=supplier_name_value)
    po_number = st.text_input(label="PO Number", value=po_number_value)
    next_button = st.form_submit_button("Next", on_click=next_page)
Expected behaviour: the dataframe's values are loaded into the sidebar inputs. The user can change the inputs if they wish and then click Next, at which point the values in the inputs are saved. The app then moves to the next dataframe, the inputs are refreshed with that dataframe's values, and the cycle repeats.
I'm not totally sure what you're going for, but after some messing around, the only way I was able to achieve this sort of sequential form submission handling is with st.experimental_rerun(). I hate to resort to that since it may be removed at any time, so hopefully there's a better way.
Without experimental_rerun(), forms take two submits to actually update state. I wasn't able to find a "correct" way to achieve an immediate update to support the expected behavior.
Here's my attempt:
import pandas as pd  # 1.5.1
import streamlit as st  # 1.18.1


def initialize_state():
    data = [
        {
            "supplier_name": ["S1"],
            "po_number": ["P123"],
        },
        {
            "supplier_name": ["S1"],
            "po_number": ["P124"],
        },
        {
            "supplier_name": ["S2"],
            "po_number": ["P125"],
        },
    ]
    state.dfs = state.get("dfs", [pd.DataFrame(x) for x in data])
    first_vals = [{x: df[x][0] for x in df.columns} for df in state.dfs]
    state.selections = state.get("selections", first_vals)
    state.pages_expanded = state.get("pages_expanded", 0)
    state.current_page = state.get("current_page", 0)
    state.just_modified_page = state.get("just_modified_page", -1)


def handle_submit(i):
    st.session_state.selections[i] = {
        "supplier_name": state.new_supplier_name,
        "po_number": state.new_po_number,
    }
    state.current_page = i
    state.just_modified_page = i

    if i < len(state.dfs) - 1 and state.pages_expanded == i:
        state.pages_expanded += 1

    st.experimental_rerun()


def render_form(i):
    with st.sidebar.form(key=f"form-{i}"):
        supplier_name = state.selections[i]["supplier_name"]
        po_number = state.selections[i]["po_number"]

        if i == state.just_modified_page:
            st.sidebar.write(
                f"Submitted! supplier_name: {supplier_name} "
                f"po_number: {po_number}"
            )
            state.just_modified_page = -1

        state.new_supplier_name = st.text_input(
            label="Supplier Name",
            value=supplier_name,
        )
        state.new_po_number = st.text_input(
            label="PO Number",
            value=po_number,
        )

        if st.form_submit_button("Next"):
            handle_submit(i)


state = st.session_state
initialize_state()

for i in range(state.pages_expanded + 1):
    render_form(i)

# debug
st.write("state.pages_expanded", state.pages_expanded)
st.write("state.current_page", state.current_page)
st.write("state.just_modified_page", state.just_modified_page)
st.write("state.dfs[state.current_page]", state.dfs[state.current_page])
st.write("state.selections", state.selections)
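The bookkeeping that handle_submit performs can be tried out without running Streamlit. The following is only a sketch under a few assumptions: a SimpleNamespace stands in for st.session_state, submit and n_pages are hypothetical names that are not part of the app, and the rerun call is left out.

```python
from types import SimpleNamespace


def submit(state, i, new_values):
    """Record the edited values for form i and reveal the next form, if any."""
    state.selections[i] = dict(new_values)
    state.current_page = i
    state.just_modified_page = i
    # Only the most recently revealed form may reveal another one,
    # and never past the last page.
    if i < state.n_pages - 1 and state.pages_expanded == i:
        state.pages_expanded += 1


# Stand-in for st.session_state (assumption: three pages, as in the example).
state = SimpleNamespace(
    n_pages=3,
    selections=[
        {"supplier_name": "S1", "po_number": "P123"},
        {"supplier_name": "S1", "po_number": "P124"},
        {"supplier_name": "S2", "po_number": "P125"},
    ],
    pages_expanded=0,
    current_page=0,
    just_modified_page=-1,
)

# The user edits the first form and presses Next.
submit(state, 0, {"supplier_name": "S10", "po_number": "somethingrandom"})
```

After this call the edit is stored under index 0, the second entry still holds S1 and P124, and pages_expanded is 1, so the next run would render two forms.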
I'm assuming you want to keep track of the user's selections, but not actually modify the dataframes. If you do want to modify the dataframes, that's simpler: replace state.selections with actual writes to dfs by index and column:
# ...

def handle_submit(i):
    # write the submitted strings straight into the single-row columns
    st.session_state.dfs[i]["supplier_name"] = state.new_supplier_name
    st.session_state.dfs[i]["po_number"] = state.new_po_number
    #st.session_state.selections[i] = {
    #    "supplier_name": state.new_supplier_name,
    #    "po_number": state.new_po_number,
    #}
    # ...

def render_form(i):
    with st.sidebar.form(key=f"form-{i}"):
        # read the current values back from the dataframe itself
        supplier_name = state.dfs[i]["supplier_name"][0]
        po_number = state.dfs[i]["po_number"][0]
        #supplier_name = state.selections[i]["supplier_name"]
        #po_number = state.selections[i]["po_number"]
        # ...
Now, it's possible to make this 100% dynamic, but I hardcoded supplier_name and po_number to avoid premature generalization that you may not need. If you do want to generalize, use df.columns throughout the code, as initialize_state already does.
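As a rough illustration of that generalization, here is a sketch that avoids hardcoded field names. It runs without Streamlit or pandas: a plain dict stands in for a one-row dataframe, and first_values and merge_edits are hypothetical helpers, not part of the code above. With real dataframes the same loops would iterate over df.columns.

```python
def first_values(record):
    """Take the first value of every column, whatever the columns are called."""
    return {column: values[0] for column, values in record.items()}


def merge_edits(defaults, edits):
    """Overlay user edits on the defaults, ignoring unknown field names."""
    return {column: edits.get(column, value) for column, value in defaults.items()}


# Stand-in for one of the dataframes in the example.
record = {"supplier_name": ["S1"], "po_number": ["P123"]}

defaults = first_values(record)
saved = merge_edits(defaults, {"po_number": "somethingrandom"})
```

Here defaults holds the values a form would start with, and saved holds what would be recorded after the user changes only the PO number.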