I have a panel dataset of countries with several indicators for each country-year observation. For simplicity, I am only reporting two indicators here, GHG and Air emissions:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

rs = np.random.RandomState(4)
pos = rs.randint(-1, 2, (4, 5)).cumsum(axis=1)
pos -= pos[:, 0, np.newaxis]
pos2 = rs.randint(-4, 3, (4, 5)).cumsum(axis=1)
pos2 -= pos[:, 0, np.newaxis]
year = np.tile(range(5), 4)
walk = np.repeat(range(4), 5)
df = pd.DataFrame(np.c_[pos.flat, pos2.flat, year, walk],
                  columns=["Air emissions", "GHG", "year", "Country ID"])
I want to build a visualization that shows the trend of each indicator in each country over the years, with one row per indicator and one column per country. So far I have done this for one indicator, Air emissions, but I would also like to show the GHG trend (and the other indicators not reported here) as rows below Air emissions:
sns.set(style="ticks")
# Initialize a grid of plots with an Axes for each country
grid = sns.FacetGrid(df, col="Country ID", hue="year", palette="tab20c",
                     col_wrap=4, height=3)
# Plot Air emissions against year in each country's Axes
grid.map(plt.plot, "year", "Air emissions", marker="o")
# Adjust the arrangement of the plots
grid.fig.tight_layout(w_pad=1)
How can I do it? By looping? But wouldn't that overwrite the graphs?
Thanks!
You'll want to reshape the data to long format: one column naming the variable that goes in each FacetGrid row, and a separate column containing the value of that variable. Probably not the best explanation, but it would look like this:
   year  Country ID       variable  value
0     0           0  Air emissions      0
1     0           0            GHG      0
2     0           1  Air emissions      0
3     0           1            GHG     -3
4     0           2  Air emissions      0
5     0           2            GHG     -2
...
Then you can set the FacetGrid parameter row to 'variable' (you'll also have to remove col_wrap, since it cannot be combined with row):
# df here is the reshaped (long-format) dataframe
grid = sns.FacetGrid(df, row='variable', col="Country ID", hue="year", palette="tab20c", height=3)
grid.map(plt.plot, "year", "value", marker="o")
You can reformat your dataframe using pivot_table:
df = df.pivot_table(index=['year', 'Country ID'], values=['Air emissions', 'GHG']).stack().reset_index()
df.columns = ['year', 'Country ID', 'variable', 'value']
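As an alternative sketch (not what the code above does), the same long layout can probably be produced with DataFrame.melt. The small wide frame and the name long_df below are made up for illustration:

```python
import pandas as pd

# Tiny wide-format frame shaped like the one in the question (made-up values)
wide = pd.DataFrame({
    "year":          [0, 1, 0, 1],
    "Country ID":    [0, 0, 1, 1],
    "Air emissions": [0, 1, 0, -1],
    "GHG":           [0, -2, -3, -4],
})

# Keep year and Country ID as identifiers; stack the indicator columns
# into a 'variable' column (indicator name) and a 'value' column
long_df = wide.melt(id_vars=["year", "Country ID"],
                    value_vars=["Air emissions", "GHG"],
                    var_name="variable", value_name="value")
print(long_df)
```

One difference worth keeping in mind: pivot_table aggregates with the mean by default, so duplicate year/country pairs would be averaged, whereas melt keeps every original row as it is. The row order also differs (melt lists all Air emissions rows first, then all GHG rows), which should not matter for FacetGrid.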