python pandas seaborn facet-grid

Setting columns of DataFrame as rows of FacetGrid figure


I have a panel dataset of countries with several indicators for each country-year observation. For simplicity, I am only reporting two indicators here, GHG and Air emissions:

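import numpy as np
import pandas as pd

rs = np.random.RandomState(4)
pos = rs.randint(-1, 2, (4, 5)).cumsum(axis=1)
pos -= pos[:, 0, np.newaxis]
pos2 = rs.randint(-4, 3, (4, 5)).cumsum(axis=1)
# pos[:, 0] is already zero at this point, so this line leaves pos2 unchanged
pos2 -= pos[:, 0, np.newaxis]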
year = np.tile(range(5), 4)
walk = np.repeat(range(4), 5)

df = pd.DataFrame(np.c_[pos.flat, pos2.flat, year, walk],
                  columns=["Air emissions", 'GHG', "year", "Country ID"])

I want to build a visualization that shows the trend of each indicator over the years for each country. Each indicator should be shown in its own row, with countries as the columns. So far, this is what I have done for one indicator, Air emissions. I would also like to show the GHG trend (and the other indicators not reported here) as additional rows below Air emissions. How?

import matplotlib.pyplot as plt
import seaborn as sns

sns.set(style="ticks")

# Initialize a grid of plots with an Axes for each country
grid = sns.FacetGrid(df, col="Country ID", hue="year", palette="tab20c",
                     col_wrap=4, height=3)

# Draw the Air emissions values over the years for each country
grid.map(plt.plot, "year", "Air emissions", marker="o")

# Adjust the arrangement of the plots
grid.fig.tight_layout(w_pad=1)

How can I do it? With a loop? But wouldn't that overwrite the graphs?

Thanks!


Solution

  • You'll want to reshape the data into long format: one column that names the variable (the indicator you want on the FacetGrid rows) and a separate column that holds the value of that variable. It would look like this:

        year  Country ID       variable  value
    0      0           0  Air emissions      0
    1      0           0            GHG      0
    2      0           1  Air emissions      0
    3      0           1            GHG     -3
    4      0           2  Air emissions      0
    5      0           2            GHG     -2
    ...
    

    Then you can set the FacetGrid parameter row to 'variable' (you'll also have to remove col_wrap, which cannot be combined with row):

    grid = sns.FacetGrid(df, row='variable', col="Country ID", hue="year", palette="tab20c", height=3)
    grid.map(plt.plot, "year", "value",  marker="o")
    

    You can reshape your DataFrame into this long format using pivot_table followed by stack:

    df = df.pivot_table(index=['year', 'Country ID'], values=['Air emissions', 'GHG']).stack().reset_index()
    df.columns = ['year', 'Country ID', 'variable', 'value']
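As an alternative sketch (not part of the original answer), the same long format can probably be produced with pandas' `DataFrame.melt`, which avoids the aggregation step that `pivot_table` performs. The small `wide` table below uses made-up values purely for illustration, and the names `wide` and `long_df` are hypothetical:

```python
import pandas as pd

# Small made-up wide table: one row per (year, country), one column per indicator.
wide = pd.DataFrame({
    "year": [0, 1, 0, 1],
    "Country ID": [0, 0, 1, 1],
    "Air emissions": [0, 1, 0, -1],
    "GHG": [0, -3, -2, 1],
})

# melt keeps the id columns and stacks the indicator columns into
# a 'variable' column (indicator name) and a 'value' column (its value).
long_df = wide.melt(
    id_vars=["year", "Country ID"],
    value_vars=["Air emissions", "GHG"],
    var_name="variable",
    value_name="value",
)

print(long_df)
```

The resulting `long_df` has the columns `year`, `Country ID`, `variable` and `value`, so it can be passed to `sns.FacetGrid` with `row='variable'` in the same way as the reshaped DataFrame above.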