Here is code that plots the total Covid-19 cases in India from January 2021 to July.
The records are daily, but my scatter plot
shows only 17 points and the date labels overlap. Is there an efficient way to fix this?
df.to_numpy()
df1 = df[(df.date.str.endswith("21")) & (df.location == "India")]
plt.scatter(x=df1.date, y=df1.total_cases)
Here is the graph:
Thanks a lot
Let's first create a toy dataframe:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
data = [[f"{i%30+1:02}/{6+i//30:02}/21", np.random.randint(i, i+10)**2]
        for i in range(5, 50)]
df = pd.DataFrame(data, columns=["date", "total_cases"])
I recommend converting the date strings to datetime objects. This makes processing easier in general and, in particular, lets matplotlib adapt its axes to the data.
df.date = pd.to_datetime(df.date, format="%d/%m/%y")
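The same conversion also helps with the filtering step in the question: once the column holds datetimes, rows can be selected by year instead of by string suffix. A minimal sketch, assuming the dates are dd/mm/yy strings; the four-row frame, its values, and the name `raw` are made up purely for illustration:

```python
import pandas as pd

# Hypothetical stand-in for the real dataset (values are invented).
raw = pd.DataFrame({
    "date": ["30/12/20", "01/01/21", "02/01/21", "15/07/21"],
    "location": ["India", "India", "France", "India"],
    "total_cases": [100, 110, 50, 300],
})

# Parse the strings once; the format is an assumption about the data.
raw["date"] = pd.to_datetime(raw["date"], format="%d/%m/%y")

# Select by year via the .dt accessor rather than str.endswith("21").
india_2021 = raw[(raw["date"].dt.year == 2021) & (raw["location"] == "India")]
print(india_2021)
```

Filtering on `.dt.year` does not depend on how the date happens to be written, so it keeps working if the string format changes.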
Now matplotlib can process the dates and choose an appropriate number of x-ticks. Since the labels are rather long, we can also rotate them.
plt.plot(df.date, df.total_cases, '.')
plt.xticks(rotation=25, ha="right")
plt.tight_layout()
plt.show()
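If the automatically chosen ticks are not what you want, the tick positions and label format can also be set explicitly with `matplotlib.dates`. The following is only a sketch: the helper name `plot_with_weekly_ticks`, the choice of one tick per Monday, and the toy data are assumptions for illustration.

```python
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd


def plot_with_weekly_ticks(dates, values):
    """Scatter-style plot with one labelled tick per Monday (hypothetical helper)."""
    fig, ax = plt.subplots()
    ax.plot(dates, values, ".")
    # Place a major tick on every Monday and label it like "07 Jun".
    ax.xaxis.set_major_locator(mdates.WeekdayLocator(byweekday=mdates.MO))
    ax.xaxis.set_major_formatter(mdates.DateFormatter("%d %b"))
    # Rotate and right-align the date labels.
    fig.autofmt_xdate(rotation=25, ha="right")
    fig.tight_layout()
    return fig, ax


# Toy daily data covering 45 days.
dates = pd.date_range("2021-06-06", periods=45, freq="D")
values = [i ** 2 for i in range(45)]
fig, ax = plot_with_weekly_ticks(dates, values)
# plt.show()  # uncomment when running interactively
```

For longer ranges, `mdates.MonthLocator()` can be swapped in for the weekly locator in the same way.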
Matplotlib 3.4 comes with a handy way of displaying dates on the x-axis:
plt.rcParams['date.converter'] = 'concise'
plt.plot(df.date, df.total_cases, '.')
plt.tight_layout()
plt.show()
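On Matplotlib versions older than 3.4, a similar result should be achievable by attaching a `ConciseDateFormatter` to the axis by hand. This is a sketch under that assumption, using made-up daily data rather than the real dataset:

```python
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Toy daily data (invented values).
dates = pd.date_range("2021-06-06", periods=45, freq="D")
values = [i ** 2 for i in range(45)]

fig, ax = plt.subplots()
ax.plot(dates, values, ".")

# Let matplotlib pick tick positions, then format them concisely.
locator = mdates.AutoDateLocator()
ax.xaxis.set_major_locator(locator)
ax.xaxis.set_major_formatter(mdates.ConciseDateFormatter(locator))

fig.tight_layout()
# plt.show()  # uncomment when running interactively
```

Unlike the rcParams setting, this affects only the one axis it is applied to.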