python pandas dataframe format string-formatting

How to use format() on pandas dataframe variable


I have the following pandas DataFrames:

phreatic_level_l2n1_28w_df.head()
       Fecha    Hora    PORVL2N1  # the PORVLxNx column name changes in each data frame
0   2012-01-12  01:37:47    0.65
1   2012-01-12  02:37:45    0.65
2   2012-01-12  03:37:50    0.64
3   2012-01-12  04:37:44    0.63
4   2012-01-12  05:37:45    0.61

And so on, up to 25 data frames named like phreatic_level_l24n2_28w_df:

.
.
.
phreatic_level_l24n2_28w_df.head()
       Fecha    Hora    PORVL24N2 # the PORVLxNx column name changes in each data frame
0   2018-01-12  01:07:28    1.31
1   2018-01-12  02:07:28    1.31
2   2018-01-12  03:07:29    1.31
3   2018-01-12  04:07:27    1.31
4   2018-01-12  05:07:27    1.31

My goal is to iterate over all the data frames and apply the following process to each one:

for i in range(1,25):
    if (i==2):
        # Convert the Fecha column values to datetime
        phreatic_level_l{}n{}_28w_df['Fecha'].format(i,i-1) = pd.to_datetime(phreatic_level_l'{}'n'{}'_28w_df['Fecha'].format(i,i-1))
    .
    .
    # And so on, for all 25 data frames

But I get the following error, because format() can only be applied to strings, not to variable names:

  File "<ipython-input-72-1f6ad7811399>", line 5
    phreatic_level_l{}n{}_28w_df['Fecha'].format(i,i-1) = pd.to_datetime(phreatic_level_l'{}'n'{}'_28w_df['Fecha'].format(i,i-1))
                    ^
SyntaxError: invalid syntax

Solution

  • str.format works on strings. You're trying to use it on a variable name.

    You could place your DataFrames in a dict and then reference them by string.

    dfs = {
        'phreatic_level_l1n0_28w_df': phreatic_level_l1n0_28w_df,
        'phreatic_level_l2n1_28w_df': phreatic_level_l2n1_28w_df,
        'phreatic_level_l3n2_28w_df': phreatic_level_l3n2_28w_df,
        ...
    }
    
    for name, df in dfs.items():
        # Assign back to the column; rebinding df itself would leave the DataFrame unchanged
        df['Fecha'] = pd.to_datetime(df['Fecha'])
    

    You can also access a specific DataFrame by its key, e.g. dfs['phreatic_level_l3n2_28w_df'].
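Here is a minimal sketch of how the dict ties back to your original loop: str.format builds the key as a string, and the key looks up the DataFrame. The two small frames and their values are made up for illustration; only the naming pattern comes from your question.

```python
import pandas as pd

# Toy stand-ins for the real data frames (hypothetical values).
dfs = {
    "phreatic_level_l1n0_28w_df": pd.DataFrame(
        {"Fecha": ["2012-01-12", "2012-01-13"], "PORVL1N0": [0.65, 0.64]}
    ),
    "phreatic_level_l2n1_28w_df": pd.DataFrame(
        {"Fecha": ["2018-01-12", "2018-01-13"], "PORVL2N1": [1.31, 1.30]}
    ),
}

for i in range(1, 3):
    # str.format is applied to a string, which is then used as a dict key.
    name = "phreatic_level_l{}n{}_28w_df".format(i, i - 1)
    # Assign the converted values back to the column of the stored frame.
    dfs[name]["Fecha"] = pd.to_datetime(dfs[name]["Fecha"])
```

With your real data you would extend the range and the dict to cover every frame you have.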

    Alternatively, you can store them in a list and iterate over them:

    dfs = [
        phreatic_level_l1n0_28w_df,
        phreatic_level_l2n1_28w_df,
        phreatic_level_l3n2_28w_df,
        ...
    ]
    
    for df in dfs:
        # Assign the converted values back to the column
        df['Fecha'] = pd.to_datetime(df['Fecha'])
    

    If you've stored them in the same order as the variable names, you can access them by index in a much less verbose way, e.g. dfs[0].

    Finally, check out this great tutorial on str.format.
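One detail worth checking when you loop over a list of DataFrames: assigning to the loop variable only rebinds that name, while assigning to a column changes the frame stored in the list. The sketch below compares the two, using a hypothetical helper make_frame and made-up values.

```python
import pandas as pd


def make_frame():
    # Hypothetical sample data shaped like the frames in the question.
    return pd.DataFrame(
        {"Fecha": ["2012-01-12", "2012-01-13"], "PORVL2N1": [0.65, 0.64]}
    )


rebound = [make_frame()]
for df in rebound:
    # Only rebinds the local name df; the frame in the list is untouched.
    df = pd.to_datetime(df["Fecha"])

updated = [make_frame()]
for df in updated:
    # Writes the converted values into the frame held by the list.
    df["Fecha"] = pd.to_datetime(df["Fecha"])

rebound_is_datetime = pd.api.types.is_datetime64_any_dtype(rebound[0]["Fecha"])
updated_is_datetime = pd.api.types.is_datetime64_any_dtype(updated[0]["Fecha"])
```

Only the second loop leaves a datetime column behind, so that is the form to use if you want the conversion to stick.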