Tags: python, matplotlib, datetime, seaborn, lmplot

format x-axis (dates) in sns.lmplot()


I have daily data that I need to plot with sns.lmplot().

The data has the following structure:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.DataFrame(columns=['date', 'origin', 'group', 'value'],
                  data = [['2001-01-01', "Peter", "A", 1.0],
                          ['2011-01-01', "Peter", "A", 1.1],
                          ['2011-01-02', "Peter", "B", 1.2],
                          ['2012-01-03', "Peter", "A", 1.3],
                          ['2012-01-01', "Peter", "B", 1.4],
                          ['2013-01-02', "Peter", "A", 1.5],
                          ['2013-01-03', "Peter", "B", 1.6],
                          ['2021-01-01', "Peter", "A", 1.7]])

I now want to plot monthly averages of the data with sns.lmplot() (my original data is more fine-grained than the toy data), using the group column as the hue. For this, I aggregate by month:

df['date'] = pd.to_datetime(df['date']).dt.strftime('%Y%m').astype(int)  # %m is the month; %M would be minutes
df = df.groupby(['date', 'origin', 'group']).agg(['mean'])
df.columns = ["_".join(pair) for pair in df.columns]  # reset col multi-index
df = df.reset_index()  # reset index

Then I plot the data:

sns.lmplot(data=df, x="date", y="value_mean", hue="group",  # the aggregated column is named "value_mean"
           ci=None, truncate=False, scatter_kws={"s": 1}, lowess=True, height=6, aspect=1.25)
plt.title("Title")
plt.ylabel("Value")
plt.show()

This works fine but the dates are messy. I would like them to be displayed as dates rather than ints.

I have found this question, but I want the grouped plot, so I cannot use regplot. The code plt.xticks(fake_dates) (following this answer) gives TypeError: object of type 'FuncFormatter' has no len().

Does someone have an idea how to address this?


Solution

    • So that the x-axis tick values can be converted back to dates, first convert the values in the 'date' column to ordinal values.
    • When iterating through the axes to configure the xtick format, the labels can be given a custom string format with .strftime:
      • new_labels = [date.fromordinal(int(label)).strftime("%b %Y") for label in labels]
    • Tested in python 3.8.12, pandas 1.3.3, matplotlib 3.4.3, seaborn 0.11.2
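
The ordinal round trip that the bullets rely on can be checked in isolation with the standard library alone. This is a minimal sketch, not part of the original answer: the helper names month_start_ordinal and ordinal_to_label are hypothetical, and bucketing by the first day of the month is an assumption about how the monthly averages might be keyed.

```python
from datetime import date


def month_start_ordinal(iso_day: str) -> int:
    """Hypothetical helper: map 'YYYY-MM-DD' to the ordinal of that month's first day."""
    d = date.fromisoformat(iso_day)
    return d.replace(day=1).toordinal()


def ordinal_to_label(ordinal: float, fmt: str = "%b %Y") -> str:
    """Hypothetical helper: turn an ordinal tick value back into a date label.

    Tick positions arrive as floats, so they are truncated to int first.
    """
    return date.fromordinal(int(ordinal)).strftime(fmt)


day = date(2011, 1, 2)
n = day.toordinal()                 # an int that grows by 1 per calendar day
assert date.fromordinal(n) == day   # the conversion is lossless for dates

print(n)
print(ordinal_to_label(n))
print(ordinal_to_label(month_start_ordinal("2011-01-02"), "%Y-%m-%d"))
```

Because ordinals advance by exactly one per day, they keep the x-axis spacing proportional to elapsed time, which an integer such as 201101 does not. The full solution below applies the same conversion to the 'date' column and to the tick values.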
    from datetime import date

    import matplotlib.pyplot as plt
    import pandas as pd
    import seaborn as sns
    
    # convert the date column to ordinal or create a new column
    df['date'] = pd.to_datetime(df['date']).apply(lambda d: d.toordinal())  # 'd' avoids shadowing the imported date class
    
    df = df.groupby(['date', 'origin', 'group']).agg(['mean'])
    df.columns = ["_".join(pair) for pair in df.columns]  # reset col multi-index
    df = df.reset_index()  # reset index
    
    # plot
    g = sns.lmplot(data=df, x="date", y="value_mean", hue="group", ci=None, truncate=False, scatter_kws={"s": 1}, lowess=True, height=6, aspect=1.5)
    
    # iterate through the axes of the figure-level plot
    for ax in g.axes.flat:
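        ticks = ax.get_xticks()  # current tick positions (ordinal values)
        new_labels = [date.fromordinal(int(tick)) for tick in ticks]  # ordinal -> date object; append .strftime("%b %Y") for a custom format
        ax.set_xticks(ticks)  # pin the tick positions so the new labels stay aligned
        ax.set_xticklabels(new_labels, rotation=0)  # set new labels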
    
    plt.title("Title")
    plt.ylabel("Value")
    plt.show()
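
As an alternative to replacing the tick labels by hand, a FuncFormatter can be attached to the axis so labels are recomputed whenever the ticks change. This is a sketch under assumptions, not the answer's method: the function name ordinal_to_month_label is hypothetical, plain matplotlib stands in for the lmplot axes, and the Agg backend is selected only so the snippet runs without a display.

```python
from datetime import date

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter


def ordinal_to_month_label(x, pos=None):
    """Hypothetical helper: format an ordinal tick value as e.g. 'Jan 2011'."""
    return date.fromordinal(int(x)).strftime("%b %Y")


# Toy x values: ordinals of a few dates taken from the question's sample data
xs = [date(2011, 1, 1).toordinal(),
      date(2012, 1, 1).toordinal(),
      date(2013, 1, 2).toordinal()]
ys = [1.1, 1.4, 1.5]

fig, ax = plt.subplots()
ax.scatter(xs, ys)

# A formatter is attached to the axis. It is not a sequence of tick positions,
# which appears to be why passing one to plt.xticks() raises the TypeError
# quoted in the question.
ax.xaxis.set_major_formatter(FuncFormatter(ordinal_to_month_label))

fig.canvas.draw()  # force the tick labels to be computed
tick_labels = [t.get_text() for t in ax.get_xticklabels()]
print(tick_labels)
```

With the figure-level plot from the answer, the same call would presumably go inside the existing loop, for example ax.xaxis.set_major_formatter(FuncFormatter(ordinal_to_month_label)) for each ax in g.axes.flat.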
    
