I'm trying to add text from a dataframe, specifically the percentage difference between two periods in a dataset, to each subplot of a seaborn relplot.
I've created an executable example:
import pandas as pd
import numpy as np
import seaborn as sns
#create dataframe
pd.set_option("display.max_columns", 200)
data = {'PTID': [11111, 11111, 11111, 11111, 22222, 22222, 22222, 22222, 33333, 33333, 33333, 33333, 44444, 44444, 44444, 44444, 55555, 55555, 55555, 55555],
'Period' : ['Baseline','p1','p2','p3','Baseline','p1','p2','p3','Baseline','p1','p2','p3', 'Baseline','p1','p2','p3', 'Baseline','p1','p2','p3'] ,
'ALK PHOS': [46.0, 94.0, 21.0, 18.0, 56.0, 104.0, 31.0, 12.0, 50.0, 100.0, 33.0, 18.0, 46.0, 94.0, 21.0, 18.0, 46.0, 94.0, 21.0, 18.0],
'AST (SGOT)': [33.0, 92.0, 19.0, 25.0, 33.0, 92.0, 21.0, 11.0, 33.0, 102.0, 18.0, 17.0, 23.0, 82.0, 13.0, 17.0, 23.0, 82.0, 13.0, 17.0],
'% Saturation- Iron': [34.0, 65.0, 10.0, 14.0, 34.0, 65.0, 10.0, 14.0, 34.0, 65.0, 10.0, 14.0, 34.0, 65.0, 10.0, 14.0, 34.0, 65.0, 10.0, 14.0]}
df = pd.DataFrame(data)
#melt into long format
dfm = df.melt(id_vars=['PTID','Period'], var_name='Metric',value_name='Value')
#get average of data for period
dfg = dfm.groupby(['PTID','Period', 'Metric'])['Value'].mean().reset_index()
#drop periods in between, only keep first and last
dfd = dfm[dfm['Period'].isin(['Baseline','p3'])]
#create dataframe with % difference between periods
dfdg = dfd.groupby(['Metric', 'Period'])['Value'].mean().reset_index()
dfp = pd.pivot(dfdg, values='Value', index=['Metric'],
columns=['Period']).reset_index()
dfp['Difference'] = ((dfp['p3'] - dfp['Baseline'])/dfp['Baseline'])*100
dfp = dfp.round(2)
#plot subplots
p = sns.relplot(data=dfd, col='Metric', x='Period', y='Value', hue = 'PTID',kind='scatter', col_wrap=5, marker='o', palette='tab10',facet_kws={'sharey': False, 'sharex': True},)
p.map(sns.lineplot, 'Period', 'Value', linestyle='--', color='gray', ci = None)
#add % change text to subplots
for row in dfp['Difference']:
    print(row)
    p.fig.text(0.5,0.5, str(row) + "%",fontsize=12)
The problem I'm having, which you'll see if you run the code, is that it doesn't iterate through the subplots when adding the text and instead places all of it on the last plot. What I'm trying to achieve is the % difference from dfp['Difference'] for the specific metric on each subplot.
I tried following an existing example of a similar problem, "Adding text to each subplot in seaborn", but that code is not executable and I'm having trouble with the "zip" function.
This is how I tried implementing the "zip" function:
#add % change text to subplots
for idx, row in zip(p.axes,dfp['Difference']):
    print(row)
    p.fig.text(0.5,0.5, str(row) + "%",fontsize=12)
I know the axes won't help me but I'm not sure how to access the subplots.
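The problem was that no ax was being used in the for loop to add the text: p.fig.text positions the text in figure coordinates, so every label was drawn at the same spot on the figure rather than on its own subplot. This issue was fixed by changing the for loop to: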
#add % change text to subplots
for ax, row in zip(p.axes.flat,dfp['Difference']):
    ax.text(0.75,0.75, str(row) + "%",fontsize=12,transform=ax.transAxes)
Another important thing to note is that transform=ax.transAxes must be added so that the (0.75, 0.75) position is read in axes coordinates, i.e. as fractions of each subplot, instead of in data coordinates.
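One caveat: zip pairs axes and values purely by position, so the loop is only correct when the rows of dfp are in the same order as the facets. groupby sorts the metrics alphabetically, while the facets appear to follow the order in which the metrics occur in the plotted data, so it is worth checking that each label sits on the right subplot. Below is a minimal sketch that looks each value up by metric name instead. It assumes seaborn 0.11 or newer, where FacetGrid.axes_dict is available, and it uses a small stand-in dataframe and pivot_table rather than the full data and pivot from the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import pandas as pd
import seaborn as sns

# Small stand-in for dfd (assumption): two periods, two metrics, two patients
dfd = pd.DataFrame({
    "PTID": [1, 1, 2, 2, 1, 1, 2, 2],
    "Period": ["Baseline", "p3"] * 4,
    "Metric": ["ALK PHOS"] * 4 + ["% Saturation- Iron"] * 4,
    "Value": [46.0, 18.0, 56.0, 12.0, 34.0, 14.0, 34.0, 14.0],
})

# Mean per metric and period, then % change from Baseline to p3.
# Metric stays as the index so values can be looked up by name.
dfp = dfd.pivot_table(values="Value", index="Metric", columns="Period", aggfunc="mean")
dfp["Difference"] = ((dfp["p3"] - dfp["Baseline"]) / dfp["Baseline"] * 100).round(2)

p = sns.relplot(data=dfd, col="Metric", x="Period", y="Value", hue="PTID",
                kind="scatter", facet_kws={"sharey": False})

# axes_dict maps each facet's Metric value to its Axes, so the label
# is matched by name rather than by position
labels = {}
for metric, ax in p.axes_dict.items():
    labels[metric] = f"{dfp.loc[metric, 'Difference']}%"
    ax.text(0.75, 0.75, labels[metric], fontsize=12, transform=ax.transAxes)
```

With the dataframes from the question, the same idea would be dfp.set_index('Metric')['Difference'] for the lookup and the same loop over p.axes_dict.items().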