Tags: machine-learning, deep-learning, gradient-descent, derivative

How can I find the unknown constant terms of a quadratic function using gradient descent?


Hi everyone.

I am a beginner in machine learning and have just started learning about gradient descent. However, I have run into one big problem. The question is as follows:

given the points [0,0], [1,1], [1,2], [2,1] and
the equation f = (a2)*x^2 + (a1)*x + a0

Solving by hand, I got the answer [-1, 5/2, 0], but I am finding it hard to reproduce that solution in Python code using gradient descent on the given data.
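(As a sanity check, my hand answer agrees with an ordinary least-squares fit, which can be computed directly; this is just for reference, not the gradient descent part I'm asking about:)

    import numpy as np

    xs = np.array([0, 1, 1, 2])
    ys = np.array([0, 1, 2, 1])

    # Degree-2 least-squares fit; returns coefficients as [a2, a1, a0]
    print(np.polyfit(xs, ys, 2))   # approximately [-1.   2.5  0. ]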

In my case, I tried to write the code in the simplest and most direct way I could think of, like this:

learningRate = 0.1

make a series of x values

initialize a2, a1, a0 to 1, 1, 1

take the partial derivatives for a2, a1, a0 (a2_p: 2x, a1_p: x, a0_p: 1)

gradient descent update, e.g.: a2 = a2 - learningRate * ( y - [(a2)*x^2 + (a1)*x + a0] ) * a2_p

P.S. Honestly, I do not know what values I should plug in for 'x' and 'y', or for a2, a1, a0.

However, I keep getting a wrong answer, and the result is different each time. So I would like a hint about the correct equations or the right code sequence.

Thank you for reading my very basic question.


Solution

  • There are a few errors in your equations:

    For the function f(x) = a2*x^2 + a1*x + a0, the partial derivatives with respect to a2, a1 and a0 are x^2, x and 1, respectively (not 2x, x and 1).
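    You can confirm these derivatives symbolically, e.g. with sympy (purely a sanity check; the solution does not depend on it):

    import sympy as sp

    x, a2, a1, a0 = sp.symbols('x a2 a1 a0')
    f = a2*x**2 + a1*x + a0

    print(sp.diff(f, a2))   # x**2
    print(sp.diff(f, a1))   # x
    print(sp.diff(f, a0))   # 1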

    Suppose the cost function is (1/2)*(y - f(x))^2.

    By the chain rule, the partial derivative of the cost function with respect to ai is -(y - f(x)) * (partial derivative of f(x) with respect to ai), where i belongs to [0,2].

    So, stepping against the gradient, the gradient descent update is:
    ai = ai + learning_rate*(y - f(x)) * (partial derivative of f(x) with respect to ai), where i belongs to [0,2]

    Note the plus sign: the minus in the cost derivative cancels against the minus of the descent step, which is why the update in your post (with a minus sign) moves in the wrong direction.
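    Written out numerically for a single sample, with the initial values a2 = a1 = a0 = 1 and the learning rate used in the code below (numbers chosen only for illustration):

    lr = 0.0005
    a2, a1, a0 = 1, 1, 1
    x, y = 2, 1                          # one training sample
    error = y - (a2*x**2 + a1*x + a0)    # 1 - 7 = -6
    a2 = a2 + lr*error*x**2              # 1 + 0.0005*(-6)*4 = 0.988
    a1 = a1 + lr*error*x                 # 1 + 0.0005*(-6)*2 = 0.994
    a0 = a0 + lr*error*1                 # 1 + 0.0005*(-6)*1 = 0.997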

    I hope this code helps:

    #Training sample
    sample = [(0,0),(1,1),(1,2),(2,1)]
    
    #Our function => a2*x^2+a1*x+a0
    class Function():
        def __init__(self, a2, a1, a0):
            self.a2 = a2
            self.a1 = a1
            self.a0 = a0
        
        def eval(self, x):
            return self.a2*x**2+self.a1*x+self.a0
        
        def partial_a2(self, x):
            return x**2
        
        def partial_a1(self, x):
            return x
        
        def partial_a0(self, x):
            return 1
    
    #Initialise function
    f = Function(1,1,1)
    
    #To Calculate loss from the sample
    def loss(sample, f):
        return sum([(y-f.eval(x))**2 for x,y in sample])/len(sample)
    
    epochs = 100000
    lr = 0.0005
    #To record the best values across epochs
    #(min_loss must be initialised outside the loop, otherwise it is
    # reset every epoch and the best values are never actually tracked)
    best_values = (0,0,0)
    min_loss = float('inf')
    
    for epoch in range(epochs):
        for x, y in sample:
            #Gradient descent: compute the error once per sample,
            #then update all three coefficients with the same gradient
            error = y - f.eval(x)
            f.a2 = f.a2 + lr*error*f.partial_a2(x)
            f.a1 = f.a1 + lr*error*f.partial_a1(x)
            f.a0 = f.a0 + lr*error*f.partial_a0(x)
        
        #Storing the best values
        epoch_loss = loss(sample, f)
        if min_loss > epoch_loss:
            min_loss = epoch_loss
            best_values = (f.a2, f.a1, f.a0)
           
    print("Loss:", min_loss)
    print("Best values (a2,a1,a0):", best_values)
    

    Output:

    Loss: 0.12500004789165717
    Best values (a2,a1,a0): (-1.0001922562970325, 2.5003368582261487, 0.00014521557599919338)
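    As a side note, the loop above updates the coefficients one sample at a time (stochastic updates). An equivalent full-batch version averages the gradient over all samples each step; here is a minimal numpy sketch of that variant (the learning rate and step count are chosen for illustration):

    import numpy as np

    xs = np.array([0., 1., 1., 2.])
    ys = np.array([0., 1., 2., 1.])

    # Column i holds the partial derivative of f(x) for each coefficient:
    # [x^2, x, 1] per sample, so f(x) = X @ a with a = [a2, a1, a0]
    X = np.stack([xs**2, xs, np.ones_like(xs)], axis=1)
    a = np.ones(3)

    lr = 0.05
    for _ in range(20000):
        errors = ys - X @ a                    # y - f(x) for every sample
        a = a + lr * (X.T @ errors) / len(xs)  # averaged gradient step

    print(a)   # converges towards [-1, 2.5, 0]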