python pandas dataframe streamlit

Unable to concatenate dataframes in streamlit


I am trying to concatenate all the dataframes whose names start with the string user_ in streamlit, but I keep getting an error.

sample code:

import streamlit as st
import pandas as pd

st.set_page_config(page_title="Science",
                    layout='wide',
                    initial_sidebar_state="expanded")


# sample dataframes
st.session_state.user_df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
st.session_state.user_df2 = pd.DataFrame({'A': [5, 6], 'B': [7, 8]})
st.session_state.df3 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})


# list of dataframes starting with user_
st.session_state.users_list = [name for name in st.session_state if name.startswith('user_')]

st.write(st.session_state.users_list)

# using eval to evaluate string dataframe names
# st.write([eval(st.session_state.name) for st.session_state.name in st.session_state.users_list])

st.session_state.df_final = pd.concat([eval(st.session_state.name) for st.session_state.name 
                                       in st.session_state.users_list], 
                                       ignore_index=True)

st.table(st.session_state.df_final)

The same logic works without streamlit, but in streamlit it raises an error and I am not sure what is wrong.

The equivalent code without streamlit, which works:

import pandas as pd

user_df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
user_df2 = pd.DataFrame({'A': [5, 6], 'B': [7, 8]})

df3 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})


users_list = ['user_df1','user_df2']

print([eval(name) for name in users_list])

df_final = pd.concat([eval(name) for name in users_list],ignore_index=True)

print('----df_final------')
print(df_final)

output:

[   A  B
0  1  3
1  2  4,    A  B
0  5  7
1  6  8]
----df_final------
   A  B
0  1  3
1  2  4
2  5  7
3  6  8

Solution

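  • In the version without streamlit, eval resolves each name to a dataframe because user_df1 and user_df2 are variables accessible in the local scope.

    In the streamlit version, the names in st.session_state.users_list are attributes of st.session_state, not variables, so eval cannot find them by their bare names. They should be accessed from the object.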

    st.session_state.df_final = pd.concat(
        [eval(f"st.session_state.{name}") for name in st.session_state.users_list],
        ignore_index=True)
    

    I recommend using the getattr function over eval because it explicitly shows that attributes of the object are being accessed.

    st.session_state.df_final = pd.concat(
        [getattr(st.session_state, name) for name in st.session_state.users_list],
        ignore_index=True)
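
The difference between the two lookups can be reproduced without streamlit. The following is a minimal sketch, assuming a types.SimpleNamespace object (named state here, a hypothetical stand-in for st.session_state) and plain lists in place of dataframes. It only illustrates why a bare name fails under eval while getattr succeeds.

```python
from types import SimpleNamespace

# Hypothetical stand-in for st.session_state: values are stored as attributes.
# Plain lists are used in place of dataframes to keep the sketch dependency-free.
state = SimpleNamespace(user_df1=[1, 2], user_df2=[5, 6], df3=[9])

# Collect the attribute names starting with "user_", as in the question.
names = sorted(name for name in vars(state) if name.startswith("user_"))

# eval looks up the bare name as a variable, and no such variable exists.
try:
    eval(names[0])
    bare_lookup_failed = False
except NameError:
    bare_lookup_failed = True

# getattr reads the attribute from the object that actually holds it.
values = [getattr(state, name) for name in names]

print(bare_lookup_failed)
print(values)
```

In streamlit itself, st.session_state also supports key-based access, so st.session_state[name] should work in the list comprehension as an alternative to getattr.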