I used matplotlib's "Discrete distribution as horizontal bar chart" gallery example to create a chart showing each party's share of the vote in the 2017 Shropshire elections.
However, because I did not know how to manipulate the data, I had to enter it into the program by hand, which is clearly down to my own ignorance.
I have the relevant data in a CSV file and can therefore load it as a dataframe.
I would like advice on how to reshape the data so that it resembles the input for this chart.
I am not sure what that input is, but it seems to be a dictionary, with ward names as keys and lists of values:
import pandas as pd
import matplotlib.pyplot as plt
category_names = ['Labour', 'LD', 'Indep', 'Green', 'Tory']
results = {'Abbey': [16, 56, 4, 0, 24],
           'Albrighton': [0, 0, 32, 0, 68],
           'Alveley & Claverley': [0, 25, 0, 0, 75],
           'Bagley': [30, 30, 0, 0, 40],
           'Battlefield': [34, 0, 0, 9, 57],
           'Bayston Hill, Column & Sutton': [53, 4, 3, 7, 33],
           'Belle Vue': [43, 28, 0, 5, 24]}
# setup dataframe using the dict provided in the OP
df = pd.DataFrame(results, index=category_names)
# display(df)
        Abbey  Albrighton  Alveley & Claverley  Bagley  Battlefield  Bayston Hill, Column & Sutton  Belle Vue
Labour     16           0                    0      30           34                             53         43
LD         56           0                   25      30            0                              4         28
Indep       4          32                    0       0            0                              3          0
Green       0           0                    0       0            9                              7          5
Tory       24          68                   75      40           57                             33         24
I am trying to get the data into this format directly from the CSV file once it is loaded as a pandas dataframe.
I have tried the values method and the to_dict method; they produce data that looks similar, but it is not quite correct.
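Since the CSV itself is not shown, here is a minimal sketch under an assumed layout: one row per ward, a 'Ward' column, and one column per party. The inline csv_text is a stand-in for the real file, and its column names are assumptions. Reading with index_col='Ward' and transposing gives the same dict-of-lists shape as results above.

```python
import io
import pandas as pd

# Hypothetical CSV layout (an assumption): one row per ward, one column per party.
csv_text = """Ward,Labour,LD,Indep,Green,Tory
Abbey,16,56,4,0,24
Albrighton,0,0,32,0,68
Belle Vue,43,28,0,5,24
"""

# index_col makes the ward names the index instead of an ordinary data column
wards = pd.read_csv(io.StringIO(csv_text), index_col="Ward")

# party names in column order, matching category_names in the example
category_names = list(wards.columns)

# transpose so each ward is a column, then build {ward: [values in party order]}
results = wards.T.to_dict("list")

# same construction as the dataframe used for plotting
df = pd.DataFrame(results, index=category_names)
```

With a real file, io.StringIO(csv_text) would be replaced by the path to the CSV. If the file already has this layout, wards.T is the plotting dataframe and the dict step can be skipped.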
Use df from the OP.
Plot it with pandas.DataFrame.plot and the parameter stacked=True.
'Party' as the y-axis
matplotlib from version 3.4.2
Use matplotlib.pyplot.bar_label, called below as the Axes method .bar_label.
Tested with pandas 1.3.2, python 3.8, and matplotlib 3.4.2.
labels = [f'{v.get_width():.0f}' if v.get_width() > 0 else '' for v in c ] works without using the assignment expression (:=), which requires python 3.8.
Use .get_height() for vertical bars.
ax = df.plot.barh(stacked=True, cmap='tab10', figsize=(16, 10))
for c in ax.containers:
    # format the number of decimal places and replace 0 with an empty string
    labels = [f'{w:.0f}' if (w := v.get_width()) > 0 else '' for v in c ]
    ax.bar_label(c, labels=labels, label_type='center')
matplotlib before version 3.4.2
Extract the .patch components in a loop, and then only plot annotations for values greater than 0.
# plot
ax = df.plot.barh(stacked=True, cmap='tab10', figsize=(16, 10))
# annotations:
for p in ax.patches:
    left, bottom, width, height = p.get_bbox().bounds
    if width > 0:
        ax.annotate(f'{width:0.0f}', xy=(left+width/2, bottom+height/2), ha='center', va='center')
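The two annotation loops above can be folded into one helper that picks whichever path the installed matplotlib supports. This is only a sketch: label_segments is a hypothetical name, and checking hasattr(ax, "bar_label") is my own choice for detecting the newer API, not something from the answer. It avoids the assignment expression so it also runs on python before 3.8.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd


def label_segments(ax, fmt="{:.0f}"):
    """Centre a value label on every non-zero horizontal bar segment (hypothetical helper)."""
    if hasattr(ax, "bar_label"):
        # newer matplotlib: one container per stacked series
        for c in ax.containers:
            labels = [fmt.format(v.get_width()) if v.get_width() > 0 else "" for v in c]
            ax.bar_label(c, labels=labels, label_type="center")
    else:
        # older matplotlib: annotate each rectangle patch by hand
        for p in ax.patches:
            left, bottom, width, height = p.get_bbox().bounds
            if width > 0:
                ax.annotate(fmt.format(width),
                            xy=(left + width / 2, bottom + height / 2),
                            ha="center", va="center")
    return ax


# two wards from the OP, with the parties as the index
df = pd.DataFrame({"Abbey": [16, 56, 4, 0, 24],
                   "Albrighton": [0, 0, 32, 0, 68]},
                  index=["Labour", "LD", "Indep", "Green", "Tory"])
ax = df.plot.barh(stacked=True)
label_segments(ax)
```

For vertical bars the same idea applies with .get_height() in place of .get_width().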
'Ward' as the y-axis
Use pandas.DataFrame.T to swap the Index and Columns.
'Ward' will now be the index and 'Party' will be the columns.
# transpose df from the OP so Party is in the columns and Ward is the index
dft = df.T
# display(dft)
                               Labour  LD  Indep  Green  Tory
Abbey                              16  56      4      0    24
Albrighton                          0   0     32      0    68
Alveley & Claverley                 0  25      0      0    75
Bagley                             30  30      0      0    40
Battlefield                        34   0      0      9    57
Bayston Hill, Column & Sutton      53   4      3      7    33
Belle Vue                          43  28      0      5    24
matplotlib from version 3.4.2
# plot
ax = df.T.plot.barh(stacked=True, figsize=(16, 10))
plt.legend(loc='center left', bbox_to_anchor=(1.0, 0.5))
# annotations:
for c in ax.containers:
    # format the number of decimal places and replace 0 with an empty string
    labels = [f'{w:.0f}' if (w := v.get_width()) > 0 else '' for v in c ]
    ax.bar_label(c, labels=labels, label_type='center')
matplotlib before version 3.4.2
# plot
ax = dft.plot.barh(stacked=True, figsize=(16, 10))
plt.legend(loc='center left', bbox_to_anchor=(1.0, 0.5))
# annotations:
for p in ax.patches:
    left, bottom, width, height = p.get_bbox().bounds
    if width > 0:
        ax.annotate(f'{width:0.0f}', xy=(left+width/2, bottom+height/2), ha='center', va='center')