python pandas matplotlib group-by subplot

Plotting a variable number of subplots, each with a variable number of lines, from a pandas DataFrame grouped by several columns


I've been working with Python for the last few months, and it still irritates me a little :)

I hope someone can explain how to plot several subplots from a DataFrame grouped by two columns using Matplotlib, or point me to where I can find information about this.

Here is an example of the DataFrame:

    name  N     coeff  X1   Y1
0   F2    F2_1  1      0    0.56451
1   F2    F2_1  5      0    0.45005
2   F2    F2_2  1      25   0.53641
3   F2    F2_2  5      25   0.53641
4   F2    F2_3  1      55   0.45067
5   F2    F2_3  5      55   0.38828
6   F2    F2_4  1      85   0.33279
7   F2    F2_4  5      85   0.35315
8   F2    F2_5  1      95   0.2756
9   F2    F2_5  5      95   0.32571
10  F2    F2_6  1      120  0.159
11  F2    F2_6  5      120  0.27583
12  F2    F2_7  1      150  0.05648
13  F2    F2_7  5      150  0.21128
14  F3    F3_1  1      0    0.42757
15  F3    F3_1  5      0    0.32409
16  F3    F3_2  1      25   0.36033
17  F3    F3_2  5      25   0.2701
18  F3    F3_3  1      55   0.2161
19  F3    F3_3  5      55   0.17014
20  F3    F3_4  1      85   0.08191
21  F3    F3_4  5      85   0.10512
22  F3    F3_5  1      95   0.04468
23  F3    F3_5  5      95   0.09686
24  F3    F3_6  1      120  0.07025
25  F3    F3_6  5      120  0.033
26  F3    F3_7  1      150  0.10689
27  F3    F3_7  5      150  0.01334

I group it by the columns name and coeff and need to plot one subplot for every name, with one line for every coeff. In this case there should be 2 subplots, for "F2" and "F3", each with 2 lines, for coeff 1 and 5.

There is no problem when I group by one column. I don't understand how it works when grouping by several columns.


Solution

  • Explanation:

    • When you group a DataFrame by a single column, you get one group per unique value of that column (e.g., one group per name).
    • When you group by two columns, you get one group per unique combination of values, i.e., subgroups (e.g., one per coeff) within each main group (e.g., within each name).
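
As a minimal sketch of that difference (the toy DataFrame below is made up for illustration and is not the asker's data), iterating over a one-column groupby gives one key per name, while a two-column groupby gives one tuple key per (name, coeff) pair:

```python
import pandas as pd

# Hypothetical toy data, only to show the shape of the group keys
toy = pd.DataFrame({
    "name": ["F2", "F2", "F3", "F3"],
    "coeff": [1, 5, 1, 5],
    "Y1": [0.5, 0.4, 0.3, 0.2],
})

# One grouping column: one group per unique name
single_keys = [key for key, _ in toy.groupby("name")]

# Two grouping columns: one group per (name, coeff) combination
pair_keys = [key for key, _ in toy.groupby(["name", "coeff"])]

print(single_keys)
print(pair_keys)
```

Because the two-column keys are tuples, a single flat loop over them would still need to map each name to its subplot, which is why a nested loop is often easier to read here.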

    To visualize this properly using Matplotlib, follow these steps:

    • Create one subplot for each unique value in the first grouping column (e.g., name).
    • In each subplot, plot a separate line for each value of the second grouping column (e.g., coeff).
    • Use nested loops to iterate through the groups.

    Code:

    import pandas as pd
    import matplotlib.pyplot as plt
    
    # DataFrame
    data = {
        "name": ["F2"] * 14 + ["F3"] * 14,
        "N": ["F2_1", "F2_1", "F2_2", "F2_2", "F2_3", "F2_3", "F2_4", "F2_4", "F2_5", "F2_5", "F2_6", "F2_6", "F2_7", "F2_7",
              "F3_1", "F3_1", "F3_2", "F3_2", "F3_3", "F3_3", "F3_4", "F3_4", "F3_5", "F3_5", "F3_6", "F3_6", "F3_7", "F3_7"],
        "coeff": [1, 5] * 7 + [1, 5] * 7,
        "X1": [0, 0, 25, 25, 55, 55, 85, 85, 95, 95, 120, 120, 150, 150,
               0, 0, 25, 25, 55, 55, 85, 85, 95, 95, 120, 120, 150, 150],
        "Y1": [0.56451, 0.45005, 0.53641, 0.53641, 0.45067, 0.38828, 0.33279, 0.35315, 0.2756, 0.32571, 0.159, 0.27583, 0.05648, 0.21128,
               0.42757, 0.32409, 0.36033, 0.2701, 0.2161, 0.17014, 0.08191, 0.10512, 0.04468, 0.09686, 0.07025, 0.033, 0.10689, 0.01334]
    }
    
    df = pd.DataFrame(data)
    
    # Create a figure with subplots for each unique 'name'
    unique_names = df["name"].unique()
    fig, axes = plt.subplots(len(unique_names), 1, figsize=(8, 6), sharex=True)
    
    # Ensure axes is iterable (for a single subplot case)
    if len(unique_names) == 1:
        axes = [axes]
    
    # Iterate over unique names and plot for each coeff
    for ax, name in zip(axes, unique_names):
        subset = df[df["name"] == name]
        for coeff, group in subset.groupby("coeff"):
            ax.plot(group["X1"], group["Y1"], marker="o", label=f"Coeff {coeff}")
    
        ax.set_title(f"Subplot for {name}")
        ax.legend()
        ax.set_ylabel("Y1")
    
    # The x-axis is shared, so label only the bottom subplot
    axes[-1].set_xlabel("X1")
    plt.tight_layout()
    plt.show()
    

    Output:

    [Image: the resulting figure, with one subplot per name]
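
The same idea can also be sketched with a nested groupby wrapped in a function. This is only one possible variant, not the only way to do it: the helper name `plot_by_name` is hypothetical, and it assumes the columns are called name, coeff, X1 and Y1 as in the question. Passing `squeeze=False` to `plt.subplots` makes it always return a 2-D array of axes, so no special case is needed when there is only one name.

```python
import pandas as pd
import matplotlib.pyplot as plt


def plot_by_name(df):
    """Draw one subplot per 'name' and one line per 'coeff' (hypothetical helper)."""
    n_names = df["name"].nunique()
    # squeeze=False always returns a 2-D array of axes, even for one subplot
    fig, axes = plt.subplots(n_names, 1, figsize=(8, 3 * n_names),
                             sharex=True, squeeze=False)
    # Outer groupby picks the subplot, inner groupby picks the line
    for ax, (name, name_df) in zip(axes[:, 0], df.groupby("name")):
        for coeff, coeff_df in name_df.groupby("coeff"):
            coeff_df = coeff_df.sort_values("X1")  # draw points left to right
            ax.plot(coeff_df["X1"], coeff_df["Y1"], marker="o",
                    label=f"coeff {coeff}")
        ax.set_title(name)
        ax.set_ylabel("Y1")
        ax.legend()
    axes[-1, 0].set_xlabel("X1")  # shared x-axis: label the bottom subplot only
    fig.tight_layout()
    return fig, axes
```

With the DataFrame `df` built in the code above, calling `fig, axes = plot_by_name(df)` and then `plt.show()` should display the figure. The number of subplots and lines follows from the data, so adding more names or coeff values needs no code changes.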