python, dataframe, for-loop, string-formatting

How can I split a single large dataframe into several smaller dataframes in Python, naming each one after its filter value?


I have a large dataframe called dfe filled with scientific information. The first column ('reaction') contains three distinct string values, say a, b and c. I want to split this dataframe into three dataframes: dfa, dfb and dfc. I also have a list called react2 that holds the values a, b and c.

Here is my code for the problem:

for i in react2:
    df{}.format(i) = dfe[dfe['reaction'] = i ]

I then get an error of:

 df{}.format(i) = dfe[dfe['reaction'] = i ]
   ^
 SyntaxError: invalid syntax

Solution

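  • Python cannot build a variable name on the left-hand side of an assignment, so df{}.format(i) = ... is a syntax error; the filter also needs the comparison operator == rather than the assignment operator =. The most sensible thing would be to store the dataframes in a dictionary keyed by the reaction value: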

    df_dict = {}
    for i in react2:
        df_dict[i] = dfe[dfe['reaction'] == i]
    

    You can put this onto a single line using a dictionary comprehension:

    df_dict = {i : dfe[dfe['reaction'] == i] for i in react2}
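To make the dictionary approach above easy to try, here is a minimal self-contained sketch. The sample contents of dfe and the 'value' column are made up for illustration; only the names dfe, react2, df_dict and the 'reaction' column come from the question. It also shows one possible alternative, pandas' groupby, which finds the distinct labels on its own, so react2 would not be needed.

```python
import pandas as pd

# Hypothetical sample data standing in for the question's dfe;
# the 'value' column is an assumption made for this example.
dfe = pd.DataFrame({
    "reaction": ["a", "b", "a", "c", "b", "a"],
    "value": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
})
react2 = ["a", "b", "c"]

# One filtered dataframe per label, keyed by the label itself.
df_dict = {i: dfe[dfe["reaction"] == i] for i in react2}

# Alternative sketch: groupby yields (label, sub-dataframe) pairs,
# so the labels do not have to be listed in advance.
df_by_group = {label: group for label, group in dfe.groupby("reaction")}

print(df_dict["a"])          # the rows where reaction == 'a'
print(sorted(df_by_group))   # the labels groupby discovered
```

Where you would have written dfa, you now write df_dict['a'], and you can loop over df_dict.items() to process every sub-dataframe in turn.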