Extrapolation in a log-log plot in Python


I am trying to extrapolate in a log-log plot in Python. I used linear regression to fit a best-fit line to the data. Now I want to extend that line to see how the fit behaves over a wider range.

My data set is large, so here is a link to it: my_data

My code looks like this:

import numpy as np
import scipy as sp
import scipy.stats
import matplotlib.pyplot as plt
from scipy.interpolate import InterpolatedUnivariateSpline
from scipy.optimize import curve_fit

#########################################################
motl = 'motl.txt'
mx, my = np.loadtxt(motl, unpack=True)


print(mx)
print(my)

# now do general curve fit for all data

# Regression Function
def regress(x, y):
    # Return a tuple of predicted y values and the linear regression result
    p = sp.stats.linregress(x, y)
    b1, b0, r, p_val, stderr = p
    y_pred = np.polyval([b1, b0], x)  # np.polyval replaces the deprecated sp.polyval alias
    return y_pred, p

# plotting
allx, ally = mx, my  # data, non-transformed
y_pred, _ = regress(np.log(allx), np.log(ally))  # regression on log-transformed input

plt.loglog(allx, ally, marker='p', color='g', markersize=3, linestyle='None')
plt.loglog(allx, np.exp(y_pred), "k:")  # back-transformed output


#################################################


# positions to inter/extrapolate
x = np.linspace(12, 14, 1000)
# spline order: 1 linear, 2 quadratic, 3 cubic ...
order = 1
# do inter/extrapolation
s = InterpolatedUnivariateSpline(np.log10(mx), np.log10(my), k=order)
y = s(x)

plt.loglog(10**x, 10**y, 'g:')
#######################################################
plt.show()

With regression, the plot looks like the following:

[Image: log-log plot of the data with the regression line]

How do I extend the line from 10^12 to 10^14? Any help is appreciated.


Solution

  • This is not a Minimal, Complete and Verifiable example: it is not minimal, and it cannot be verified because the code throws error messages. To solve your problem, you only have to extend the range of x values at which the regression line is evaluated. I assume that is the purpose of

    x = np.linspace(12, 14, 1000)
    

    But since your code raises an error at the line

    s = InterpolatedUnivariateSpline(np.log10(mx), np.log10(my), k=order)

    I can't test it. Instead, here is a minimal example that produces the desired output:

    import matplotlib.pyplot as plt
    import numpy as np
    import scipy.stats as stats
    
    motl = 'motl.txt'
    mx, my = np.loadtxt(motl, unpack=True)
    
    # log-log plot of the original data
    plt.loglog(mx, my, marker = 'o', color = 'g', markersize = 3, linestyle = 'None')
    # x values for the predicted line, given as base-10 exponents (10**13 to 10**16)
    x_pred = np.linspace(13, 16, 1000)
    # linear regression on the log10-transformed data, matching the base of the log-log graph
    b1, b0, _r, _p_val, _stderr = stats.linregress(np.log10(mx), np.log10(my))
    # corresponding log10(y) values from the regression slope and intercept
    y_pred = b1 * x_pred + b0
    # log-log plot of the regression line, transformed back to data units
    plt.loglog(10 ** x_pred, 10 ** y_pred, color = 'b', linestyle = "-")
    plt.show()
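
To try the same approach without motl.txt, here is a minimal sketch that wraps the fit in a helper and runs it on synthetic data. The function name extrapolate_loglog, its parameters, and the generated power-law data are assumptions made for illustration; they are not part of the original question or answer. It relies on the fact that a straight line in log-log space, log10(y) = b1 * log10(x) + b0, corresponds to the power law y = 10**b0 * x**b1.

```python
import numpy as np
from scipy import stats


def extrapolate_loglog(x, y, x_min, x_max, num=200):
    """Fit a line to (log10 x, log10 y) and evaluate it on [x_min, x_max].

    Hypothetical helper for illustration only.
    """
    fit = stats.linregress(np.log10(x), np.log10(y))
    # Evenly spaced in log10(x), so the points are evenly spread on a log axis.
    x_new = np.logspace(np.log10(x_min), np.log10(x_max), num)
    # log10(y) = slope * log10(x) + intercept  <=>  y = 10**intercept * x**slope
    y_new = 10 ** fit.intercept * x_new ** fit.slope
    return x_new, y_new, fit


# Synthetic stand-in for motl.txt (an assumption): y = 3 * x**0.5 with scatter.
rng = np.random.default_rng(0)
x_data = np.logspace(10, 12, 50)
y_data = 3.0 * x_data ** 0.5 * 10 ** rng.normal(0.0, 0.02, x_data.size)

# Extend the fitted line beyond the data, from 10^12 to 10^14.
x_ext, y_ext, fit = extrapolate_loglog(x_data, y_data, 1e12, 1e14)
print("slope:", fit.slope, "intercept:", fit.intercept)
```

The returned x_ext and y_ext are in data units, so they can be passed directly to plt.loglog(x_ext, y_ext) alongside the original points. The extended line only continues the fitted power law; it says nothing about whether the data actually follow that law outside the measured range.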