python matplotlib stacked-chart

matplotlib dataframe 2 column [dates, non-numerical-data] stacked bar chart defining attributes


Background:

I have managed to create the following graph, but I am having difficulty with some of its elements.

Disclaimer:

The graph below is what I want to achieve, but I would like to integrate the changes from my questions into it. If there is an alternative way of obtaining a stacked graph with all the dates, please feel free to share the code with me.

[image: the current stacked bar chart]

Question:

How can I define the following:

  • Make the bars wider
  • Make the y-axis ticks integers
  • Change the date format of the x-axis (to %a %d/%b/%y)
  • Define the chart size (400 by 800); it is a little small, and I think the dates are getting cut off
  • Add the title "this is my chart" to the chart
  • Add the labels "this is x axis" and "this is y-axis" to the x and y axes

MWE:

import datetime as dt
import mysql.connector
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# mycursor, query, COLOR_LIST and CURRENT_DIRECTORY are defined elsewhere in the full script
mycursor.execute(query)
data = mycursor.fetchall()

df = pd.DataFrame(data, columns=['date', 'Operation'])

df['date'] = pd.to_datetime(df.date)

all_dates = pd.date_range('2020-05-01','2020-05-31', freq='D').date

(pd.crosstab(df.date,df.Operation)
.reindex(all_dates)
.plot.bar(stacked=True, color=COLOR_LIST)
)

filename = "\\TEST_month_of_{}.png".format("May").lower()
plt.savefig(CURRENT_DIRECTORY + filename)
print("\n\nGenerated: {}".format(CURRENT_DIRECTORY + filename))

Data set:

print(df) yields the following:

date          Operation
2020-05-07        A
2020-05-08        B
2020-05-08        A
2020-05-12        A
2020-05-12        A
2020-05-12        B
2020-05-13        C
2020-05-13        A
2020-05-13        B
2020-05-14        A
2020-05-19        B
2020-05-21        A
2020-05-25        A
2020-05-26        B
2020-05-26        C
2020-05-26        A
2020-05-26        A
2020-05-29        A

Solution

  • As for the date format, I couldn't get the result I wanted because my locale is different. To make the bars wider, I dropped the dates that have no data, so that fewer bars share the axis; plot.bar also accepts a width argument if every date has to stay on the axis.

    import pandas as pd
    import numpy as np
    import io
    
    data = '''
    date Operation
    2020-05-07 A
    2020-05-08 B
    2020-05-08 A
    2020-05-12 A
    2020-05-12 A
    2020-05-12 B
    2020-05-13 C
    2020-05-13 A
    2020-05-13 B
    2020-05-14 A
    2020-05-19 B
    2020-05-21 A
    2020-05-25 A
    2020-05-26 B
    2020-05-26 C
    2020-05-26 A
    2020-05-26 A
    2020-05-29 A
    '''
    
    df = pd.read_csv(io.StringIO(data), sep=r'\s+') # raw string avoids an invalid-escape warning
    df['date'] = pd.to_datetime(df['date'])
    all_dates = pd.date_range('2020-05-01','2020-05-31', freq='D').date
    df2 = pd.crosstab(df.date,df.Operation).reindex(all_dates)
    
    import matplotlib.pyplot as plt
    
    df2.dropna(inplace=True) # Drop dates without data so the remaining bars are wider
    
    # Create the axes first and pass them to pandas; otherwise DataFrame.plot.bar
    # draws on a new default-sized figure and figsize is ignored.
    fig, ax = plt.subplots(figsize=(4,8),dpi=100) # Chart size: 4 x 8 in at 100 dpi = 400 x 800 px
    df2.plot.bar(stacked=True, ax=ax)
    
    ax.set_title('this is my chart') # Chart title
    
    ax.set_xlabel('this is x-axis') # Axis labels
    ax.set_ylabel('this is y-axis')
    
    start, end = ax.get_ylim()
    ax.yaxis.set_ticks(np.arange(start, end, 1)) # Integer ticks on the y-axis
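
As an alternative to computing the tick positions by hand, matplotlib's MaxNLocator with integer=True restricts an axis to integer ticks. This is a minimal sketch rather than part of the original answer; the small counts frame and its labels are made-up sample data, and the Agg backend is only chosen so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption: no display is needed)
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.ticker import MaxNLocator

# Hypothetical counts per day and operation, standing in for the crosstab result
counts = pd.DataFrame({"A": [1, 2, 1], "B": [0, 1, 1]}, index=["d1", "d2", "d3"])

fig, ax = plt.subplots()
counts.plot.bar(stacked=True, ax=ax)

# Let matplotlib choose the tick positions, but only at integer values
ax.yaxis.set_major_locator(MaxNLocator(integer=True))

fig.canvas.draw()
yticks = ax.get_yticks()
```

The locator keeps working if the data range changes later, whereas a fixed np.arange list has to be recomputed.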
    

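
The two items left open above (the date format and the bar width with every date kept) might be handled as sketched below. This is a hedged sketch, not the answerer's method: it assumes the index is kept as a DatetimeIndex (no .date conversion) so that strftime is available, it uses a shortened sample of the question's data, and width=0.9 is an arbitrary illustrative value. The names produced by %a and %b follow the active locale, which is the limitation mentioned above.

```python
import io
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption: no display is needed)
import matplotlib.pyplot as plt
import pandas as pd

# Shortened sample of the question's data
data = """date Operation
2020-05-07 A
2020-05-08 B
2020-05-08 A
2020-05-12 A
"""
df = pd.read_csv(io.StringIO(data), sep=r"\s+", parse_dates=["date"])

# Keep every date in the range; missing days become zero-height bars
all_dates = pd.date_range("2020-05-07", "2020-05-12", freq="D")
counts = pd.crosstab(df["date"], df["Operation"]).reindex(all_dates, fill_value=0)

# Build the tick labels in the requested format (locale-dependent names)
labels = list(counts.index.strftime("%a %d/%b/%y"))

fig, ax = plt.subplots(figsize=(4, 8), dpi=100)  # 400 x 800 px
counts.plot.bar(stacked=True, width=0.9, ax=ax)  # width is a fraction of the slot per date
ax.set_xticklabels(labels, rotation=90)
fig.tight_layout()  # reduces the chance of the date labels being cut off
```

Replacing the labels after plotting works here because a pandas bar chart places one tick per row, so the label list lines up with the bars one to one.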