python pandas dataframe loops

Loop over the number in a DataFrame name and its filter value


I want to loop from 1 to 8 so that the number 7 in each DataFrame name and in the filter condition of this code is replaced by the loop value. How can I do this?

So far, my code is as follows:

df_sample_7 = df[df["quality"]==7]
df_sample_7_mean = df_sample_7.describe().T
df_sample_7_mean = df_sample_7_mean.drop(columns=["count", "std", "min", "25%", "75%", "max"])

Solution

  • Store the DataFrames in a list instead of creating numbered variable names.

    df_samples = []
    df_sample_means = []
    for quality in range(1, 9):
        # Rows whose "quality" equals the current value
        sample = df[df["quality"] == quality]
        df_samples.append(sample)
        # Transposed summary, keeping only the "mean" and "50%" columns
        summary = sample.describe().T
        df_sample_means.append(summary.drop(columns=["count", "std", "min", "25%", "75%", "max"]))
    

    The only thing to watch for is that list indices run from 0 to 7, not 1 to 8, so the sample for quality 7 is at df_samples[6]. If that really bothers you, use a dictionary keyed by the quality value instead:

    df_samples = {}
    df_sample_means = {}
    for quality in range(1, 9):
        # Key each entry by the quality value itself (1 to 8)
        df_samples[quality] = df[df["quality"] == quality]
        summary = df_samples[quality].describe().T
        df_sample_means[quality] = summary.drop(columns=["count", "std", "min", "25%", "75%", "max"])
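
As a possible variation on the dictionary approach, here is a minimal sketch that uses `groupby` instead of an explicit range, so the keys are whatever quality values actually occur in the data. This is a different technique from the loop above, not part of the original answer. The toy DataFrame and its `alcohol` and `pH` columns are made up for illustration; only the `quality` column comes from the question.

```python
import pandas as pd

# Toy data standing in for the real `df`; "alcohol" and "pH" are hypothetical columns.
df = pd.DataFrame({
    "quality": [5, 5, 6, 6, 7, 7],
    "alcohol": [9.0, 10.0, 10.0, 11.0, 12.0, 13.0],
    "pH": [3.2, 3.4, 3.3, 3.5, 3.1, 3.3],
})

# Build one summary table per quality value found in the data.
# Selecting ["mean", "50%"] keeps the same columns that remain
# after dropping count, std, min, 25%, 75% and max.
df_sample_means = {
    quality: group.describe().T[["mean", "50%"]]
    for quality, group in df.groupby("quality")
}

print(df_sample_means[7])
```

Note that with this approach a quality value that never appears in `df` gets no key at all, whereas the explicit loop over 1 to 8 creates an entry for every value, including ones with no matching rows.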