I've been working with a temperature time series and want to know whether there is a way to control the dates shown on the x-axis. Matplotlib picks the tick dates on its own. I want a tick at the exact date of the anomaly, labelled EQ, with the other ticks showing the number of days before and after the EQ.
For example, the x-axis might become [-25, -20, -15, -10, -5, "EQ", 5, 10].
Here's the data that I am currently working with:
Here's what I want to achieve.
And here's the code that I've written so far:
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import seaborn as sns

# at_day is assumed to be a DataFrame with a DatetimeIndex and the columns used below
fig, ax = plt.subplots(figsize=(8, 4))
sns.scatterplot(data=at_day, x=at_day.index, y="AT_Day_Time_Values", hue="Legend", palette="bright", s=80, ax=ax)
sns.lineplot(data=at_day, x=at_day.index, y="AT_Day_Time_Values", linewidth=1.5, linestyle="--", color="black", ax=ax, label="AT (Day)")
sns.rugplot(data=at_day, x=at_day.index, y="AT_Day_Time_Values", hue="Legend", ax=ax)
ax.set_xticks(at_day.index.date)  # note: replaced by the locator set on the next line
ax.xaxis.set_major_locator(mdates.DayLocator(interval=5))
ax.set(ylim=[295, 307], ylabel="K")
ax.grid(axis="y")
Please ignore other things like hue for now; I am only interested in learning how to control the dates on the x-axis.
You can generate the ticks automatically with this approach. I assumed you always want a spacing of 5 days between ticks; you can adjust this later if you want:
import pandas as pd
import numpy as np
%matplotlib notebook
import matplotlib.pyplot as plt
df = pd.read_csv("air_temp.csv")
df["Dates"] = pd.to_datetime(df["Dates"])
# get limits for prompt
minDate = df["Dates"].min()
maxDate = df["Dates"].max()
EQ = None # init
while EQ is None or not df["Dates"].eq(EQ).any(): # repeat until the date exists in the data
    # prompt user to input a date between the two limits, like this: yyyy-mm-dd
    EQ = input(f"Choose a date between {minDate.date()} and {maxDate.date()}: ")
    EQ = pd.to_datetime(EQ, errors="coerce") # convert to date; invalid input becomes NaT
df['DatesDiff'] = df['Dates'] - EQ # get time difference
plt.figure() # generate a figure
plt.plot(df['DatesDiff'].dt.days, df['AT_Day_Time_Values']) # plot
minVal = np.floor(min(df["DatesDiff"].dt.days) / 5) * 5 # get suitable limit in negative
maxVal = np.ceil(max(df["DatesDiff"].dt.days) / 5) * 5 # get suitable limit in positive
ticks = np.arange(minVal, maxVal + 1, 5) # define the ticks you want to set, they will include 0 and have 5 spacing
labels = [str(int(tick)) if tick != 0 else 'EQ' for tick in ticks] # generate labels as string, with "EQ" at 0
plt.xticks(ticks, labels) # set the ticks
plt.title("EQ = "+str(EQ.date())) # set title for OP
This is with EQ: 2022-08-05
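If you want to change the spacing later, the tick logic above can be pulled into a small helper. This is only a sketch: the name `eq_ticks` and its `step` and `label` parameters are made up for illustration, and it uses plain Python instead of NumPy so it can be tried without the CSV file.

```python
import math

def eq_ticks(day_offsets, step=5, label="EQ"):
    """Return (ticks, labels) spaced `step` days apart, with `label` at offset 0.

    `day_offsets` is any iterable of day differences relative to the EQ date
    (hypothetical helper, not part of matplotlib or pandas).
    """
    offsets = list(day_offsets)
    lo = math.floor(min(offsets) / step) * step  # round down to a multiple of step
    hi = math.ceil(max(offsets) / step) * step   # round up to a multiple of step
    ticks = list(range(lo, hi + 1, step))
    labels = [label if t == 0 else str(t) for t in ticks]
    return ticks, labels

ticks, labels = eq_ticks([-23, -10, 0, 4, 9])
print(ticks)   # [-25, -20, -15, -10, -5, 0, 5, 10]
print(labels)  # ['-25', '-20', '-15', '-10', '-5', 'EQ', '5', '10']
```

With the DataFrame from the code above, something like `plt.xticks(*eq_ticks(df["DatesDiff"].dt.days, step=5))` should give the same ticks as before, and changing `step` changes the spacing.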
I guess you would like to identify the peak automatically, so here is a way to do it:
df = pd.read_csv("air_temp.csv")
df["Dates"] = pd.to_datetime(df["Dates"])
EQ = df["Dates"].loc[df["AT_Day_Time_Values"].idxmax()] # date of the maximum temperature
df["DatesDiff"] = df["Dates"] - EQ # get time difference
plt.figure() # generate a figure
plt.plot(df["DatesDiff"].dt.days, df["AT_Day_Time_Values"]) # plot
minVal = np.floor(min(df["DatesDiff"].dt.days) / 5) * 5 # get suitable limit in negative
maxVal = np.ceil(max(df["DatesDiff"].dt.days) / 5) * 5 # get suitable limit in positive
ticks = np.arange(minVal, maxVal + 1, 5) # define the ticks you want to set, they will include 0
labels = [str(int(tick)) if tick != 0 else 'EQ' for tick in ticks] # generate labels as string, with "EQ" at 0
plt.xticks(ticks, labels) # set the ticks
plt.title("EQ = "+str(EQ.date())) # set title for OP
Since the anomaly is simply the maximum here, this was easy. Detecting an anomaly gets more complicated as the data becomes noisier. Here's the result:
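For noisier data, one possible starting point (an assumption on my side, not something tested on your file) is to score each value by its distance from the median, scaled by the median absolute deviation, and take the highest score. Unlike `idxmax()`, this also catches unusually low values. The function name `most_anomalous_index` and the sample temperatures are made up for illustration.

```python
from statistics import median

def most_anomalous_index(values):
    """Index of the value farthest from the median, measured in units of the MAD.

    Hypothetical helper: a global robust score, so it ignores trends and seasonality.
    """
    values = list(values)
    med = median(values)
    mad = median(abs(v - med) for v in values)  # median absolute deviation
    if mad == 0:
        mad = 1.0  # flat series: fall back to the raw absolute deviation
    scores = [abs(v - med) / mad for v in values]
    return max(range(len(values)), key=scores.__getitem__)

temps = [299.1, 298.7, 299.4, 298.9, 305.8, 299.2, 298.8]  # made-up values in K
print(most_anomalous_index(temps))  # 4
```

The returned position could then replace the `idxmax()` lookup, for example `EQ = df["Dates"].iloc[most_anomalous_index(df["AT_Day_Time_Values"])]`. If the series has a strong trend, a rolling window would probably be needed instead of a single global median.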