Hi, I'm trying to make a histogram from the table above, and my code is below.
def histograms(t):
    salaries = t.column('Salary')
    salary_bins = np.arange(min(salaries), max(salaries)+1000, 1000)
    t.hist('Salary', bins=salary_bins, unit='$')

histograms(full_data)
But it's not showing properly. Can you help me?
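The bins argument of a histogram can be either an integer or a sequence of bin edges. As an integer, it is the number of equal-width bins that the range of the data is split into. As a sequence (like the np.arange call in your code), it gives the bin boundaries explicitly.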
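To make the two ways of passing bins concrete, here is a small sketch using np.histogram, which takes the same kind of bins argument. The salary figures are the sample values used in this answer, and the 20,000-wide edges are just an illustrative choice, not something from your data.

```python
import numpy as np

# Sample salary figures (same values as the sample dataframe in this answer).
salaries = [105324, 65002, 98314, 24480, 55000, 62000, 75000, 79000, 32000]

# bins as an integer: the range min..max is split into 10 equal-width bins,
# so 11 edges come back.
counts_int, edges_int = np.histogram(salaries, bins=10)

# bins as a sequence: the edges are used exactly as given.
# Illustrative edges: 20000, 40000, ..., 120000 (5 bins).
fixed_edges = np.arange(20000, 120001, 20000)
counts_fixed, edges_fixed = np.histogram(salaries, bins=fixed_edges)

print(len(edges_int), counts_int.sum())
print(edges_fixed, counts_fixed)
```

With an integer, the edges depend on the data's minimum and maximum. With a sequence, you control where each bin starts and ends.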
Let's say you have a sample dataframe of salaries like this:
import pandas as pd
sample_dataframe = pd.DataFrame({'name':['joe','jill','martin','emily','frank','john','sue','sally','sam'],
                                 'salary':[105324,65002,98314,24480,55000,62000,75000,79000,32000]})
#output:
     name  salary
0     joe  105324
1    jill   65002
2  martin   98314
3   emily   24480
4   frank   55000
5    john   62000
6     sue   75000
7   sally   79000
8     sam   32000
If you want to plot a histogram with the salaries grouped into 10 bins and keep your function, you can do:
import matplotlib.pyplot as plt
def histograms(t):
    plt.hist(t.salary, bins = 10, color = 'orange', edgecolor = 'black')
    plt.xlabel('Salary')
    plt.ylabel('Count')
    plt.show()

histograms(sample_dataframe)
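If you want the x-axis ticks to reflect the boundaries of the 10 bins, you can add these lines (the import goes at the top, and the plt.xticks call goes inside the function, before plt.show()):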
import numpy as np
plt.xticks(np.linspace(min(t.salary), max(t.salary), 11), rotation = 45)
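As a quick sanity check on why 11 evenly spaced ticks line up with 10 bins, the sketch below compares the np.linspace ticks with the edges NumPy computes for bins=10. It assumes, as far as I know is the case, that plt.hist leaves the binning to NumPy. The salaries list just repeats the sample values from this answer.

```python
import numpy as np

# Sample salary figures (same values as the sample dataframe in this answer).
salaries = [105324, 65002, 98314, 24480, 55000, 62000, 75000, 79000, 32000]

# Edges NumPy would use for 10 equal-width bins over min..max.
edges = np.histogram_bin_edges(salaries, bins=10)

# Tick positions from the xticks line: 11 evenly spaced points over min..max.
ticks = np.linspace(min(salaries), max(salaries), 11)

# True when every tick sits on a bin boundary.
ticks_match_edges = bool(np.allclose(edges, ticks))
print(ticks_match_edges)
```

plt.hist also returns the bin edges as its second return value, so another option is to capture them and pass them straight to plt.xticks.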
Finally, to show the y-ticks as integers, add these lines (again, the import at the top and the call inside the function, before plt.show()):
from matplotlib.ticker import MaxNLocator
plt.gca().yaxis.set_major_locator(MaxNLocator(integer=True))
The final function looks like this:
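import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import MaxNLocator

def histograms(t):
    # 10 equal-width bins between the smallest and largest salary
    plt.hist(t.salary, bins = 10, color = 'orange', edgecolor = 'black')
    plt.xlabel('Salary')
    plt.ylabel('Count')
    # force whole-number ticks on the count axis
    plt.gca().yaxis.set_major_locator(MaxNLocator(integer=True))
    # 11 ticks = the 11 edges of the 10 bins
    plt.xticks(np.linspace(min(t.salary), max(t.salary), 11), rotation = 45)
    plt.show()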
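If you would rather keep fixed-width bins, as your np.arange approach does, you can pass the edges as bins instead of a count. Below is a rough sketch of one way to build edges that fall on round numbers. round_bin_edges is a hypothetical helper I made up for illustration, and the 10,000 width is an assumption you would tune to your data (1,000-wide bins over a range like this one would give dozens of mostly empty bars).

```python
import numpy as np

def round_bin_edges(values, width):
    """Hypothetical helper: fixed-width bin edges aligned to multiples of width."""
    lo = np.floor(min(values) / width) * width   # round the minimum down
    hi = np.ceil(max(values) / width) * width    # round the maximum up
    if hi == lo:                                 # all values equal: keep one bin
        hi = lo + width
    # go one step past hi so that hi itself is included as the last edge
    return np.arange(lo, hi + width, width)

# Sample salary figures (same values as the sample dataframe in this answer).
salaries = [105324, 65002, 98314, 24480, 55000, 62000, 75000, 79000, 32000]

edges = round_bin_edges(salaries, 10000)         # assumed width of 10,000
counts, _ = np.histogram(salaries, bins=edges)
print(edges)
print(counts)
```

The resulting edges array can be passed as bins=edges to plt.hist and as the tick positions to plt.xticks, so the bars and the labels stay aligned.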