Tags: python, pandas, text, seaborn, subplot

Add Text from Dataframe to Seaborn relplot


I'm trying to add text from a dataframe, specifically the percentage difference between two periods in a dataset, to a seaborn relplot with multiple subplots.

I've created an executable example:

import pandas as pd
import numpy as np
import seaborn as sns

#create dataframe 
pd.set_option("display.max_columns", 200)
data = {'PTID': [11111, 11111, 11111, 11111, 22222, 22222, 22222, 22222, 33333, 33333, 33333, 33333, 44444, 44444, 44444, 44444, 55555, 55555, 55555, 55555],
        'Period' : ['Baseline','p1','p2','p3','Baseline','p1','p2','p3','Baseline','p1','p2','p3',  'Baseline','p1','p2','p3', 'Baseline','p1','p2','p3'] ,    
        'ALK PHOS': [46.0, 94.0, 21.0, 18.0, 56.0, 104.0, 31.0, 12.0, 50.0, 100.0, 33.0, 18.0, 46.0, 94.0, 21.0, 18.0, 46.0, 94.0, 21.0, 18.0],
        'AST (SGOT)': [33.0, 92.0, 19.0, 25.0, 33.0, 92.0, 21.0, 11.0, 33.0, 102.0, 18.0, 17.0, 23.0, 82.0, 13.0, 17.0, 23.0, 82.0, 13.0, 17.0],
        '% Saturation- Iron': [34.0, 65.0, 10.0, 14.0, 34.0, 65.0, 10.0, 14.0, 34.0, 65.0, 10.0, 14.0, 34.0, 65.0, 10.0, 14.0, 34.0, 65.0, 10.0, 14.0]}
df = pd.DataFrame(data)

#melt into long format 
dfm = df.melt(id_vars=['PTID','Period'], var_name='Metric',value_name='Value')

#get average of data for period
dfg = dfm.groupby(['PTID','Period', 'Metric'])['Value'].mean().reset_index()

#drop periods in between, only keep first and last 
dfd = dfm[dfm['Period'].isin(['Baseline','p3'])]

#create dataframe with % difference between periods 
dfdg = dfd.groupby(['Metric', 'Period'])['Value'].mean().reset_index()
dfp = pd.pivot(dfdg, values='Value', index=['Metric'],
                    columns=['Period']).reset_index()
dfp['Difference'] = ((dfp['p3'] - dfp['Baseline'])/dfp['Baseline'])*100
dfp = dfp.round(2)

#plot subplots
p = sns.relplot(data=dfd, col='Metric', x='Period', y='Value', hue = 'PTID',kind='scatter', col_wrap=5, marker='o', palette='tab10',facet_kws={'sharey': False, 'sharex': True},)
p.map(sns.lineplot, 'Period', 'Value',  linestyle='--', color='gray', ci = None)

#add % change text to subplots 
for row in dfp['Difference']:
    print(row)
    p.fig.text(0.5,0.5, str(row) + "%",fontsize=12)

The problem I'm having, which you'll see if you run the code, is that the loop doesn't iterate through the subplots when adding the text; it places all of it on the last plot. What I'm trying to achieve is to show the % difference from

dfp["Difference"] 

for the specific metric on each subplot.

I tried following this existing example of a similar problem - Adding text to each subplot in seaborn

but the code there is not executable, and I'm having trouble with the "zip" function.

This is how I tried implementing the "zip" function:

#add % change text to subplots 
for idx, row in zip(g.axes,dfp['Difference']):
    print(row)
    p.fig.text(0.5,0.5, str(row) + "%",fontsize=12)

I know the axes won't help me but I'm not sure how to access the subplots.


Solution

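  • The problem was that the loop called p.fig.text, which draws on the figure as a whole, so no individual subplot (ax) was ever targeted. Iterating over the grid's axes and calling ax.text on each one fixes this. Note that the FacetGrid in the question is named p, not g:

    #add % change text to subplots 
    for ax, row in zip(p.axes.flat, dfp['Difference']):
        ax.text(0.75, 0.75, str(row) + "%", fontsize=12, transform=ax.transAxes)
    

    Another important thing to note is that transform=ax.transAxes must be added so that the coordinates are interpreted as fractions of each subplot's axes (0 to 1) rather than as data values.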
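
One caveat with the zip approach is that it pairs subplots and values purely by position. In this example the facets appear to follow the order in which each Metric first occurs in dfd, while dfp comes out of the groupby and pivot sorted by Metric, so the two orders may not line up. Below is a minimal sketch of matching by name instead, assuming seaborn 0.11 or later, where FacetGrid.axes_dict maps each facet name to its Axes. The helper names annotate_by_name and format_pct are made up for illustration, and the signed two-decimal label format is just one possible choice.

```python
def format_pct(value):
    """Format a percentage difference with an explicit sign, e.g. '-65.57%'."""
    return f"{value:+.2f}%"


def annotate_by_name(axes_by_name, diff_by_name, x=0.75, y=0.75, fontsize=12):
    """Write each metric's percentage change onto the subplot of the same name.

    axes_by_name: mapping of facet name -> Matplotlib Axes
                  (for example, FacetGrid.axes_dict)
    diff_by_name: mapping of facet name -> percentage difference
                  (a dict, or a pandas Series indexed by metric name)
    Returns the facet names for which no value was found.
    """
    missing = []
    for name, ax in axes_by_name.items():
        if name not in diff_by_name:
            # No value for this facet: leave the subplot unlabelled.
            missing.append(name)
            continue
        ax.text(
            x, y, format_pct(diff_by_name[name]),
            fontsize=fontsize,
            transform=ax.transAxes,  # x and y are fractions of the axes
        )
    return missing
```

With the variables from the question, this could be called as annotate_by_name(p.axes_dict, dfp.set_index('Metric')['Difference']). Looking values up by metric name means the labels stay correct even if the subplot order or the row order of dfp changes.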