1. Problem
I wanted to see how transaction prices change over time, so I drew a scatter plot.
The overall picture came out roughly right, but one minor issue remains: the display interval of the x-axis labels.
My first attempt and the relevant part of the code are as follows.
import matplotlib.pyplot as plt
import matplotlib
import matplotlib.dates as mdates
from datetime import datetime
import pandas as pd
df = pd.read_excel('data.xlsx')
df = df.loc[df['계약면적(㎡)'] > 33]
# The data comes as 202109; build the string 2021-09-01, parse it as a date,
# then format it back to a 202109-style string with dt.strftime('%Y%m').
df['계약년월'] = df['계약년월'].astype(str)
df['계약년월'] = df['계약년월'].str[0:4] + '-' + df['계약년월'].str[4:6] + '-01'
df['계약년월'] = pd.to_datetime(df['계약년월'])
df['계약년월'] = df['계약년월'].dt.strftime('%Y%m')
# Graph
plt.figure(figsize=(30,10))
ax = plt.gca()
yticks = [100000, 200000, 300000, 400000, 500000, 600000, 700000, 800000]
ylabels = [10, 20, 30, 40, 50, 60, 70, 80]
plt.yticks(yticks, labels = ylabels)
ax.xaxis.set_major_locator(mdates.MonthLocator())
plt.scatter(df['계약년월'], df['면적당 금액(원)'])
plt.xlabel('계약년월')
plt.ylabel('면적당 금액(원/㎡)')
plt.savefig('Graph.jpg')
As you can see, the x-tick labels are displayed as 201101 201308 201512 201807 202101.
I would like to mark the end of every year instead, as 201112 201212 201312 201412 201512 201612 201712 201812 201912 202012 and so on.
2. What I've tried
Since I could easily set the yticks the way I wanted, I tried applying the same method to the xticks. The code is as follows.
plt.figure(figsize=(30,10))
ax = plt.gca()
yticks = [100000, 200000, 300000, 400000, 500000, 600000, 700000, 800000]
ylabels = [10, 20, 30, 40, 50, 60, 70, 80]
plt.yticks(yticks, labels = ylabels)
# xticks I want to show.
xticks = ['201112', '201212', '201312', '201412', '201512', '201612', '201712', '201812', '201912', '202012']
# Convert the list above to datetime objects using the '%Y%m' format.
xticks = [datetime.strptime(x, '%Y%m') for x in xticks]
# xlabels displayed in the graph
xlabels = ['2011y-end', '2012y-end', '2013y-end', '2014y-end', '2015y-end', '2016y-end', '2017y-end', '2018y-end', '2019y-end', '2020y-end']
plt.xticks(xticks, labels = xticks)
ax.xaxis.set_major_locator(mdates.MonthLocator(bymonth=None, interval=2, tz=None))
plt.scatter(df['계약년월'], df['면적당 금액(원)'])
plt.xlabel('계약년월')
plt.ylabel('면적당 금액(원/㎡)')
plt.savefig('Graph.jpg')
However, the results were disastrous.
Perhaps something went wrong when I modified the ticks: in the result, 201212 is followed by 201213 instead of 201301.
I was worried about this, so I used strftime and strptime to convert both the data and the tick list to the '%Y%m' date format, but I don't understand why it did not work as intended.
Please bear with any inefficient code, as I am not yet used to Python; I would appreciate any help in solving the problem.
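import datetime
import matplotlib.pyplot as plt
from matplotlib import dates as mdates
# datetime.datetime needs a day as well, so use the first of each month
plt.xlim(datetime.datetime(2011, 12, 1), datetime.datetime(2020, 12, 1))
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%d-%b\n%Y'))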
You can set the x-axis major formatter however you wish; here I used %d for the day, %b for the month, \n for a new line, and %Y for the year.
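The formatter only helps when the x values are real dates. In the question, dt.strftime('%Y%m') turns the column back into strings, and matplotlib plots strings as categories, so a date locator such as MonthLocator has no dates to work with. A minimal sketch of one way around this is below. It assumes the column holds 202109-style values, and it uses synthetic data in place of data.xlsx (the random prices and the date range are made up for illustration): keep the column as datetimes, place one tick every December with MonthLocator(bymonth=12), and format the labels as %Y%m.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for data.xlsx (hypothetical values, not the asker's data)
rng = np.random.default_rng(0)
months = pd.date_range("2011-01-01", "2021-01-01", freq="MS")
df = pd.DataFrame({
    "계약년월": [int(m.strftime("%Y%m")) for m in months],
    "면적당 금액(원)": rng.integers(100_000, 800_000, size=len(months)),
})

# Parse 202109-style values into datetimes and keep them as datetimes
# (no dt.strftime afterwards, which would turn them back into strings)
df["계약년월"] = pd.to_datetime(df["계약년월"].astype(str), format="%Y%m")

fig, ax = plt.subplots(figsize=(30, 10))
ax.scatter(df["계약년월"], df["면적당 금액(원)"])

# One major tick every December, labelled in YYYYMM form
ax.xaxis.set_major_locator(mdates.MonthLocator(bymonth=12))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y%m"))

# Render once so the formatter fills in the tick label texts
fig.canvas.draw()
tick_labels = [t.get_text() for t in ax.get_xticklabels()]
```

After this, fig.savefig('Graph.jpg') can be called as in the question. If labels such as '2011y-end' are preferred, DateFormatter('%Yy-end') should produce them, since characters outside the % directives are passed through unchanged.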