Search code examples
pythonmathstatisticssympymathematical-optimization

Using maximum likelihood to derive regression coefficients in Python


As a learning exercise for myself, I am trying to estimate the regression parameters using the MLE method in Python.

From what I have gathered, the log-likelihood function boils down to maximizing the following:

enter image description here

So I need to take partial derivatives with respect to the intercept and the slope, setting each to zero, and this should give me the coefficients.

I have been trying to approach this using sympy as follows:

from sympy import *
b = Symbol('b')# beta1
a = Symbol('a')# intercept
x = Symbol('x', integer=True)
y = Symbol('y', integer=True)
i = symbols('i', cls=Idx)

x_values = [2,3,2]
y_values = [1,2,3]
n = len(x_values)-1

function = summation((Indexed('y',i) - a+b*Indexed('x',i))**2, (i, 0, n))

partial_intercept = function.diff(a)

print(partial_intercept)
# 6*a - 2*b*x[0] - 2*b*x[1] - 2*b*x[2] - 2*y[0] - 2*y[1] - 2*y[2]

intercept_f = lambdify([x, y], partial_intercept)

inter = solve(intercept_f(x_values, y_values), a)

print(inter)
# [7*b/3 + 2]

I would have expected a single value for the slope, such that the 'b' variable is gone. However, I see that this wouldn't be possible given that the variable b is still there in my derivative equation.

Does anyone have any advice on where I am going wrong?

Thanks!

Edit : Fixed a typo in the codeblock


Solution

  • The expression 7*b/2 + 2 at the end tell you that we have to satisfies a = 7*b/2 + 2, it depends on the quantity of b.

    You should solve for both a and b as a system simultaneously.

    In the following code, I find the relationship that a and b has to satisfies and solve them simultaneously.

    from sympy import *
    
    b = Symbol('b')# beta1
    a = Symbol('a')# intercept
    x = Symbol('x', integer=True)
    y = Symbol('y', integer=True)
    i = symbols('i', cls=Idx)
    
    x_values = [2,3,2]
    y_values = [1,2,3]
    n = len(x_values)-1
    
    function = summation((Indexed('y',i) - a+b*Indexed('x',i))**2, (i, 0, n))
    
    partial_intercept = function.diff(a)
    
    print(partial_intercept)
    # 6*a - 2*b*x[0] - 2*b*x[1] - 2*b*x[2] - 2*y[0] - 2*y[1] - 2*y[2]
    
    intercept_f = lambdify([x, y], partial_intercept)
    
    inter = solve(intercept_f(x_values, y_values), a)
    
    print(inter)
    #[7*b/3 + 2]
    
    partial_gradient = function.diff(b)
    
    print(partial_gradient)
    # 6*a - 2*b*x[0] - 2*b*x[1] - 2*b*x[2] - 2*y[0] - 2*y[1] - 2*y[2]
    
    intercept_f = lambdify([x, y], partial_gradient)
    
    inter2 = solve(intercept_f(x_values, y_values), b)
    print(inter2)
    
    ans = solve([a-inter[0], b-inter2[0]])
    print(ans)
    

    Here are the outputs:

    6*a - 2*b*x[0] - 2*b*x[1] - 2*b*x[2] - 2*y[0] - 2*y[1] - 2*y[2]
    [7*b/3 + 2]
    2*(-a + b*x[0] + y[0])*x[0] + 2*(-a + b*x[1] + y[1])*x[1] + 2*(-a + b*x[2] + y[2])*x[2]
    [7*a/17 - 14/17]
    {a: 2, b: 0}
    

    a should be set to be 2 and b should be set to be 0.