python numpy finance quantitative-finance approximation

How to more accurately approximate a set of points?


I would like to approximate bond yields in Python, but I am not sure which curve describes them best.

import numpy as np
import matplotlib.pyplot as plt

x = [0.02, 0.22, 0.29, 0.38, 0.52, 0.55, 0.67, 0.68, 0.74, 0.83, 1.05, 1.06, 1.19, 1.26, 1.32, 1.37, 1.38, 1.46, 1.51, 1.61, 1.62, 1.66, 1.87, 1.93, 2.01, 2.09, 2.24, 2.26, 2.3, 2.33, 2.41, 2.44, 2.51, 2.53, 2.58, 2.64, 2.65, 2.76, 3.01, 3.17, 3.21, 3.24, 3.3, 3.42, 3.51, 3.67, 3.72, 3.74, 3.83, 3.84, 3.86, 3.95, 4.01, 4.02, 4.13, 4.28, 4.36, 4.4]
y = [3, 3.96, 4.21, 2.48, 4.77, 4.13, 4.74, 5.06, 4.73, 4.59, 4.79, 5.53, 6.14, 5.71, 5.96, 5.31, 5.38, 5.41, 4.79, 5.33, 5.86, 5.03, 5.35, 5.29, 7.41, 5.56, 5.48, 5.77, 5.52, 5.68, 5.76, 5.99, 5.61, 5.78, 5.79, 5.65, 5.57, 6.1, 5.87, 5.89, 5.75, 5.89, 6.1, 5.81, 6.05, 8.31, 5.84, 6.36, 5.21, 5.81, 7.88, 6.63, 6.39, 5.99, 5.86, 5.93, 6.29, 6.07]

a = np.polyfit(np.power(x,0.5), y, 1)
y1 = a[0]*np.power(x,0.5)+a[1]

b = np.polyfit(np.log(x), y, 1)
y2 = b[0]*np.log(x) + b[1]

c = np.polyfit(x, y, 2)
y3 = c[0] * np.power(x,2) + np.multiply(c[1], x) + c[2]

plt.figure(figsize=(10, 5))            # set the figure size before plotting
plt.plot(x, y, 'o', color='black')     # raw data points
plt.plot(x, y1, lw=3, color='red')     # square-root fit
plt.plot(x, y2, lw=3, color='green')   # logarithmic fit
plt.plot(x, y3, lw=3, color='blue')    # quadratic fit
plt.axis([0, 4.5, 2, 8])
plt.show()

The parabola turns down at the end (blue), the logarithmic curve falls off too quickly at the beginning (green), and the square-root curve has a strange hump (red). Are there other ways to get a more accurate approximation, or are these fits already pretty good?



Solution

  • Your fits look really good! To compare them more objectively, you can look at the sum of squared residuals and the covariance of the coefficients.

    a, residuals, rank, sv, rcond = np.polyfit(np.power(x, 0.5), y, 1, full=True)
    a, cov = np.polyfit(np.power(x, 0.5), y, 1, cov=True)

    residuals is the sum of squared residuals of the least-squares fit, returned when full=True. cov is the covariance matrix of the polynomial coefficient estimates; polyfit returns it only when cov=True and full=False, which is why two calls are needed. The diagonal of this matrix holds the variance estimate for each coefficient.
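As a rough sketch of how that comparison could be run for all three candidate curves from the question, the snippet below fits each one and prints its sum of squared residuals and the standard errors of its coefficients. The names `models` and `results` are illustrative only, and the standard errors are simply the square roots of the diagonal of the covariance matrix.

```python
import numpy as np

# Data copied from the question.
x = np.array([0.02, 0.22, 0.29, 0.38, 0.52, 0.55, 0.67, 0.68, 0.74, 0.83,
              1.05, 1.06, 1.19, 1.26, 1.32, 1.37, 1.38, 1.46, 1.51, 1.61,
              1.62, 1.66, 1.87, 1.93, 2.01, 2.09, 2.24, 2.26, 2.3, 2.33,
              2.41, 2.44, 2.51, 2.53, 2.58, 2.64, 2.65, 2.76, 3.01, 3.17,
              3.21, 3.24, 3.3, 3.42, 3.51, 3.67, 3.72, 3.74, 3.83, 3.84,
              3.86, 3.95, 4.01, 4.02, 4.13, 4.28, 4.36, 4.4])
y = np.array([3, 3.96, 4.21, 2.48, 4.77, 4.13, 4.74, 5.06, 4.73, 4.59,
              4.79, 5.53, 6.14, 5.71, 5.96, 5.31, 5.38, 5.41, 4.79, 5.33,
              5.86, 5.03, 5.35, 5.29, 7.41, 5.56, 5.48, 5.77, 5.52, 5.68,
              5.76, 5.99, 5.61, 5.78, 5.79, 5.65, 5.57, 6.1, 5.87, 5.89,
              5.75, 5.89, 6.1, 5.81, 6.05, 8.31, 5.84, 6.36, 5.21, 5.81,
              7.88, 6.63, 6.39, 5.99, 5.86, 5.93, 6.29, 6.07])

# Each candidate is a polynomial in a transformed x: (transformed x, degree).
models = {
    "sqrt": (np.sqrt(x), 1),
    "log": (np.log(x), 1),
    "quadratic": (x, 2),
}

results = {}
for name, (t, deg) in models.items():
    # full=True returns (coeffs, residuals, rank, singular_values, rcond).
    coeffs, residuals, rank, sv, rcond = np.polyfit(t, y, deg, full=True)
    # cov=True (with full left at False) returns (coeffs, covariance matrix).
    _, cov = np.polyfit(t, y, deg, cov=True)
    sse = float(residuals[0])          # sum of squared residuals
    std_err = np.sqrt(np.diag(cov))    # one standard error per coefficient
    results[name] = {"coeffs": coeffs, "sse": sse, "std_err": std_err}
    print(f"{name:10s} SSE = {sse:.3f}  std errors = {std_err}")
```

A lower sum of squared residuals means a closer fit to these points, but the quadratic has three coefficients against two for the other curves, so a small advantage there is not necessarily meaningful.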