Tags: python, matplotlib, plot, seaborn, xticks

How to make xticks in matplotlib evenly spaced despite their value, based on groupby?


I am trying to plot the average return by age for different companies. I want the age values to be evenly spaced on the x-axis, even though the step size between them increases and is not uniform. My goal looks like this: [image: Target of my plot]

fig, ax = plt.subplots(figsize=(10,6), sharex=True, sharey=True)

ax.plot(ipo_cut[["IR", "Age", "Timespan"]][(ipo_cut["Timespan"] == "1980-1989") & (ipo_cut["Age"] >= 2)].groupby("Age").mean(), color="r")
ax2 = ax.twinx()
ax2.plot(ipo_cut[["IR", "Age", "Timespan"]][(ipo_cut["Timespan"] == "1990-1998") & (ipo_cut["Age"] >= 2)].groupby("Age").mean(), color = "g")
ax3 = ax.twinx()
ax3.plot(ipo_cut[["IR", "Age", "Timespan"]][(ipo_cut["Timespan"] == "1999-2000") & (ipo_cut["Age"] >= 2)].groupby("Age").mean(), color="grey")
ax4 = ax.twinx()
ax4.plot(ipo_cut[["IR", "Age", "Timespan"]][(ipo_cut["Timespan"] == "2001-2003") & (ipo_cut["Age"] >= 2)].groupby("Age").mean())

ax.set(title ="Average First-day Returns by Age of Firm at Time of IPO",
      xlabel = "Age",
      ylabel = "Average First-Day Return in %")
ax.set_ylim([0,1])
ax2.set_ylim([0,1])
ax3.set_ylim([0,1])
ax4.set_ylim([0,1])
ax.set_xlim([0,70])
ax2.set_xlim([0,70])
ax3.set_xlim([0,70])
ax4.set_xlim([0,70])
ax2.get_yaxis().set_visible(False)
ax3.get_yaxis().set_visible(False)
ax4.get_yaxis().set_visible(False)

plt.xticks([0,1,2,3,4,5,6,7,8,9,10, 11,12,13,15,17,20,30,40,50,60,70])
plt.setp(ax.get_xticklabels(), rotation = 90)

fig.show()

and the output looks like the following: [image: My output plot]

Based on other posts I could display only the desired years of "Age", but the axis is unfortunately not evenly distributed. I feel like I am really close to a solution, but could not find an answer by myself.

A solution in seaborn would also be very helpful to me.

EDIT with some additional information: The shape of the x-vector and the data does not match. I have filtered my dataframe only for the relevant ages, but some information is missing. Here is a minimal reproducible example:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
    
#create dataframe
x = np.arange(0,11)
df = pd.DataFrame(data = np.random.rand(len(x)), index = x, columns=["IR"])
df.index.name = "Age"
df.loc[[0, 9], "IR"] = np.nan
df.dropna(inplace = True)

# open figure
fig, ax = plt.subplots(figsize=(10,6))
# create x-values
x = [0,1,2,5,7,9,10]
# plot
ax.plot(x, df, color="r")
ax.set(
    title ="Average First-day Returns by Age of Firm at Time of IPO",
    xlabel = "Age",
    ylabel = "Average First-Day Return in %",
    ylim = [0,1],
    xlim = [0,10])
ax.tick_params(axis='x', labelrotation=90) 
fig.show()

My goal is to plot the whole dataframe without gaps. So the idea is that I have the line plot in the background with the needed x-values on the x-axis. I hope this information helps.

SOLUTION:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
    
#create dataframe
x = np.arange(0,10)
df = pd.DataFrame(data = np.random.rand(len(x)), index = x, columns=["IR"])
df.index.name = "Age"
#create nan in sample
df.loc[[0, 9], "IR"] = np.nan
# slice data for unbroken line (solution by max)
lg = df["IR"].isna()
# create ticks for even distribution
n = 10
a = np.arange(n)
# open figure
fig, ax = plt.subplots(figsize=(10,6))
# create x-values for tick labels
x = (0,5,10,13,15,24,30,40,55,70)
# plot
ax.plot(a[~lg], df[~lg], color="r")
ax.set(
    title ="Average First-day Returns by Age of Firm at Time of IPO",
    xlabel = "Age",
    ylabel = "Average First-Day Return in %",
    ylim = [0,1],
    xlim = [0,10])
ax.xaxis.set_ticks(a) #set the ticks to be a
ax.xaxis.set_ticklabels(x) # change the ticks' names to x
ax.tick_params(axis='x', labelrotation=90) 
fig.show()

Thanks to the input of max and "How to make ticks evenly spaced despite their value?" I could find a solution to my problem. If the index of your original dataframe has a step size other than one, I suggest simply calling reset_index() so that the rows are numbered 0, 1, 2, ... and can serve as evenly spaced tick positions.
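
For the grouped data from the original question, the same idea might be sketched as follows. This is only an illustration under stated assumptions: the `ipo_cut` frame below is a randomly generated stand-in for the real data, and the `position` mapping and `pos` column are names introduced here, not part of the original code. All time spans are drawn on one axes, so no `twinx()` calls are needed.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for ipo_cut: one row per IPO
rng = np.random.default_rng(0)
ages = [2, 3, 5, 10, 17, 30, 70]
ipo_cut = pd.DataFrame({
    "Age": rng.choice(ages, size=200),
    "Timespan": rng.choice(["1980-1989", "1990-1998"], size=200),
    "IR": rng.random(200),
})

# Mean return per (Timespan, Age) as a flat table
means = ipo_cut.groupby(["Timespan", "Age"])["IR"].mean().reset_index()

# Map every distinct age to an evenly spaced position 0, 1, 2, ...
tick_ages = np.sort(means["Age"].unique())
position = {age: i for i, age in enumerate(tick_ages)}
means["pos"] = means["Age"].map(position)

fig, ax = plt.subplots(figsize=(10, 6))
for timespan, group in means.groupby("Timespan"):
    # Plot against the positions, not the raw ages
    ax.plot(group["pos"], group["IR"], label=timespan)

# Ticks sit at the positions but are labelled with the real ages
ax.set_xticks(range(len(tick_ages)))
ax.set_xticklabels([str(a) for a in tick_ages])
ax.set(xlabel="Age", ylabel="Average First-Day Return", ylim=[0, 1])
ax.legend()
```

Because every age is mapped through the same dictionary, a time span that lacks some ages still lines up with the others on the shared axis.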


Solution

  • I recommend creating the desired x-vector right at the beginning and plotting the data over it. Then you can let matplotlib do the rest. The other way around, you artificially create an evenly spaced plot (if you do not provide x-values, matplotlib assumes that the data is a sequence of points with step size 1), and you just change the appearance of the x-axis rather than the graph itself.

    import matplotlib.pyplot as plt
    import numpy as np
    
    # open figure
    fig, ax = plt.subplots(figsize=(10,6))
    # create x-values
    x = [0,1,2,3,4,5,6,7,8,9,10, 11,12,13,15,17,20,30,40,50,60,70]
    # create (random) y-values
    y = np.random.rand(len(x))
    # plot
    ax.plot(x, y, color="r")
    ax.set(
        title ="Average First-day Returns by Age of Firm at Time of IPO",
        xlabel = "Age",
        ylabel = "Average First-Day Return in %",
        ylim = [0,1],
        xlim = [0,70])
    ax.tick_params(axis='x', labelrotation=90) 
    fig.show()
    


    I stripped your code and created some artificial dummy data. Have a look at: How to create a minimal reproducible example.
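
The remark above that matplotlib assumes a step size of 1 when no x-values are given can be checked with a minimal sketch. The y-values here are made up purely for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

y = [0.2, 0.5, 0.1, 0.9]  # made-up values

fig, ax = plt.subplots()
(line,) = ax.plot(y)  # no x-values passed

# matplotlib filled in the positions 0, 1, 2, 3 on its own
print(line.get_xdata())
```

This is why plotting only the y-values and relabelling the ticks yields evenly spaced points, while passing the real ages as x keeps their true distances.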