I have a plot with me which is logarithmic on both the axes. I have pyplot's loglog
function to do this. It also gives me the logarithmic scale on both the axes.
Now, using numpy I fit a straight line to the set of points that I have. However, when I plot this line on the plot, I cannot get a straight line. I get a curved line.
The blue line is the supposedly "straight line". It is not getting plotted straight. I want to fit this straight line to the curve plotted by red dots
Here is the code I am using to plot the points:
import numpy
from matplotlib import pyplot as plt
import math
fp=open("word-rank.txt","r")
a=[]
b=[]
for line in fp:
string=line.strip().split()
a.append(float(string[0]))
b.append(float(string[1]))
coefficients=numpy.polyfit(b,a,1)
polynomial=numpy.poly1d(coefficients)
ys=polynomial(b)
print polynomial
plt.loglog(b,a,'ro')
plt.plot(b,ys)
plt.xlabel("Log (Rank of frequency)")
plt.ylabel("Log (Frequency)")
plt.title("Frequency vs frequency rank for words")
plt.show()
Your linear fit is not performed on the same data as shown in the loglog-plot.
Make a and b numpy arrays like this
a = numpy.asarray(a, dtype=float)
b = numpy.asarray(b, dtype=float)
Now you can perform operations on them. What the loglog-plot does, is to take the logarithm to base 10 of both a and b. You can do the same by
logA = numpy.log10(a)
logB = numpy.log10(b)
This is what the loglog plot visualizes. Check this by ploting both logA and logB as a regular plot. Repeat the linear fit on the log data and plot your line in the same plot as the logA, logB data.
coefficients = numpy.polyfit(logB, logA, 1)
polynomial = numpy.poly1d(coefficients)
ys = polynomial(b)
plt.plot(logB, logA)
plt.plot(b, ys)