I am trying to draw a frequency line plot using matplotlib, with the x-axis being the amount (loan_amount) and the y-axis the number of occurrences of that amount (loan_count), but I am not sure how to use the number of occurrences as y-values.
I'd think the general code has to start similar to this, but I am not sure what y should be and how to complete it:
import sqlite3
import matplotlib.pyplot as plt

con = sqlite3.connect('databaseTest.db')
cur = con.cursor()
cur.execute("SELECT LOAN_AMOUNT FROM funded")
loan_amount = cur.fetchall()  # list of 1-tuples, one per row
loan_amount_list = [i[0] for i in loan_amount]  # unpack the tuples
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
x = loan_amount_list
I want the final plot to look like this:
Any help is much appreciated! Thanks!
-- Edit:
Implementing Counter from collections as suggested below leads to the following plot, which is not what I am aiming for:
I don't know what is in your database or what format it is in (if you post that, I'll modify my answer), but here's how I'd solve this problem.
I assume that in SELECT LOAN_AMOUNT FROM funded, LOAN_AMOUNT is some sort of integer column.
So:
import numpy as np
import matplotlib.pyplot as plt
loan_amount = cur.fetchall()  # rows from the SELECT above, as 1-tuples
loan_amount = np.array(loan_amount, dtype='int').ravel()  # flatten to a 1-D numpy array
x, y = np.unique(loan_amount, return_counts=True)  # x: sorted distinct amounts, y: count of each
plt.scatter(x, y)
plt.show()
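Since I can't see your database, here is a minimal sketch of the counting step that you can run on its own. It assumes an in-memory SQLite database standing in for databaseTest.db, and the sample amounts are made up purely for illustration:

```python
import sqlite3
import numpy as np

# Hypothetical in-memory stand-in for databaseTest.db (assumption, not the real data)
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE funded (LOAN_AMOUNT INTEGER)")
sample_amounts = [500, 1000, 1000, 250, 500, 1000]  # made-up values
cur.executemany("INSERT INTO funded VALUES (?)", [(a,) for a in sample_amounts])

cur.execute("SELECT LOAN_AMOUNT FROM funded")
rows = cur.fetchall()  # list of 1-tuples, e.g. [(500,), (1000,), ...]
loan_amount = np.array(rows, dtype=int).ravel()  # flatten the (n, 1) array to 1-D

# x holds the sorted distinct amounts, y holds how often each one occurs
x, y = np.unique(loan_amount, return_counts=True)
print(x.tolist(), y.tolist())
con.close()
```

Here x and y have the same length, so they can go straight into plt.scatter(x, y) or plt.plot(x, y).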
If I feed some randomly distributed data into the np.unique snippet above, I get the following picture, which is probably what you were looking for:
>>> a = np.random.rayleigh(1000,100000)
>>> a = a.astype('int')
>>> x ,y = np.unique(a,return_counts=True)
>>> plt.scatter(x,y)
<matplotlib.collections.PathCollection object at 0x7f3b18a524e0>
>>> plt.show()
>>>
A line plot is a little bit messy, but how the result looks depends on your dataset:
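A rough sketch of the line-plot variant, using the same Rayleigh test data as above. The helper frequency_line and its bin_width parameter are names I made up for this example; grouping amounts into fixed-width bins is my own suggestion for smoothing the line, not something your data necessarily needs:

```python
import numpy as np
import matplotlib.pyplot as plt

def frequency_line(values, bin_width=None):
    """Return (x, y): sorted distinct values and the count of each.

    bin_width is a hypothetical convenience parameter: when given, values are
    first rounded down to a multiple of bin_width, so x is each bin's left edge.
    """
    values = np.asarray(values).ravel()
    if bin_width is not None:
        values = (values // bin_width) * bin_width
    # np.unique returns the distinct values sorted, so the line runs left to right
    return np.unique(values, return_counts=True)

rng = np.random.default_rng(0)
a = rng.rayleigh(1000, 100000).astype(int)  # synthetic test data, not real loans

fig, ax = plt.subplots()
x, y = frequency_line(a, bin_width=50)
ax.plot(x, y)
ax.set_xlabel("loan_amount")
ax.set_ylabel("loan_count")
```

Call plt.show() afterwards to display the figure. With bin_width=None you get one point per distinct amount, which is where the jagged line comes from; a larger bin_width trades detail for a smoother curve.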