python numpy matplotlib loglog

Fitting a straight line to a log-log curve


I have a plot that is logarithmic on both axes. I use pyplot's loglog function to draw it, which gives me a logarithmic scale on both axes.

Next, I used numpy to fit a straight line to my set of points. However, when I draw this line on the plot, it comes out curved instead of straight. The blue line is the supposedly "straight line". I want to fit a straight line to the curve plotted by the red dots.

Here is the code I am using to plot the points:

import numpy
from matplotlib import pyplot as plt
import math
fp=open("word-rank.txt","r")
a=[]
b=[]

for line in fp:
    string=line.strip().split()
    a.append(float(string[0]))
    b.append(float(string[1]))

coefficients=numpy.polyfit(b,a,1)
polynomial=numpy.poly1d(coefficients)
ys=polynomial(b)
print(polynomial)
plt.loglog(b,a,'ro')
plt.plot(b,ys)
plt.xlabel("Log (Rank of frequency)")
plt.ylabel("Log (Frequency)")
plt.title("Frequency vs frequency rank for words")
plt.show()

Solution

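  • Your linear fit is not performed on the same data that the loglog plot shows: polyfit works on the raw values, while the plot displays them on logarithmic axes. A line that is straight in the raw values looks curved there.

    First, make a and b numpy arrays like this: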

    a = numpy.asarray(a, dtype=float)
    b = numpy.asarray(b, dtype=float)
    

    Now you can perform element-wise operations on them. The loglog plot effectively shows the base-10 logarithm of both a and b. You can compute the same values with:

    logA = numpy.log10(a)
    logB = numpy.log10(b)
    

    This is what the loglog plot visualizes. Check this by plotting logA against logB as a regular plot. Then repeat the linear fit on the log data and draw your line in the same plot as the logA, logB data.

    coefficients = numpy.polyfit(logB, logA, 1)
    polynomial = numpy.poly1d(coefficients)
    ys = polynomial(logB)
    plt.plot(logB, logA)
    plt.plot(logB, ys)
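
If you would rather keep the original loglog axes than plot the logarithms on linear axes, one option is to transform the fitted line back before drawing it. A straight line in log space, log10(y) = m * log10(x) + c, corresponds to y = 10**c * x**m in the raw values. The following is only a minimal sketch of that idea: the helper names fit_power_law and plot_fit are made up for illustration, and the data is synthetic rather than read from word-rank.txt.

```python
import numpy
from matplotlib import pyplot as plt


def fit_power_law(x, y):
    """Fit log10(y) = slope * log10(x) + intercept and return (slope, intercept)."""
    log_x = numpy.log10(numpy.asarray(x, dtype=float))
    log_y = numpy.log10(numpy.asarray(y, dtype=float))
    # polyfit returns the highest-degree coefficient first
    slope, intercept = numpy.polyfit(log_x, log_y, 1)
    return slope, intercept


def plot_fit(x, y):
    """Draw the points and the fitted line on log-log axes."""
    x = numpy.asarray(x, dtype=float)
    slope, intercept = fit_power_law(x, y)
    # Undo the logarithm: y = 10**intercept * x**slope
    fitted = 10.0 ** intercept * x ** slope
    plt.loglog(x, y, 'ro')
    plt.loglog(x, fitted)
    plt.xlabel("Rank of frequency")
    plt.ylabel("Frequency")
    plt.show()


# Made-up data that follows frequency = 1000 / rank exactly (an assumption,
# standing in for the contents of word-rank.txt)
rank = numpy.array([1.0, 2.0, 5.0, 10.0, 50.0, 100.0])
frequency = 1000.0 / rank
slope, intercept = fit_power_law(rank, frequency)
print(slope, intercept)
```

For this synthetic data the slope should come out close to -1 and the intercept close to 3, since log10(1000 / rank) = 3 - log10(rank). Calling plot_fit(rank, frequency) then draws the fitted line, which appears straight because both the points and the line are drawn with loglog.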