computer-vision, conv-neural-network, calculus

CS231n Convolutional Neural Networks


I was watching the online lecture for Stanford's CS231n and have a question; I may simply be confusing something. The link is: the video

Go to 35:46 and in the backward function, the formula for dx is:

dx = self.y * dz. 

I don't understand this, since

z = x*y. 

So

dx = dz/y

Can someone please explain why the two formulas differ?


Solution

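  • This is just unusual notation in the lecture code: dz, dx and dy are not used in their usual sense. The variable dz here denotes the derivative of the cost function L (of the complete neural network) with respect to z, that is dL/dz, while dx and dy denote the derivatives of L with respect to x and y, that is dL/dx and dL/dy. The derivative of z with respect to x is y, which the code stores as self.y. With this notation in mind, the rest follows from the chain rule: dL/dx = (dL/dz) * (dz/dx), so dx = dz * self.y. Dividing by y would only make sense if dz and dx were small changes in z and x themselves (with y held fixed), which is not what these variables hold.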
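To make the chain-rule argument concrete, here is a minimal sketch of a multiply gate in the same style as the backward function quoted in the question. The class name MultiplyGate, the input values, and the loss L = z**2 are assumptions chosen for illustration; they are not taken from the lecture. The gradient returned by backward is compared against a numerical estimate of dL/dx.

```python
class MultiplyGate:
    """Hypothetical gate computing z = x * y (name chosen for illustration)."""

    def forward(self, x, y):
        # Cache the inputs; backward needs them for the local derivatives.
        self.x = x
        self.y = y
        return x * y

    def backward(self, dz):
        # dz is dL/dz, the gradient of the loss flowing in from above.
        # Chain rule: dL/dx = dL/dz * dz/dx, and dz/dx = y.
        dx = self.y * dz
        # Likewise dL/dy = dL/dz * dz/dy, and dz/dy = x.
        dy = self.x * dz
        return dx, dy


def loss(z):
    # Assumed toy loss, used only so that dL/dz has a concrete value.
    return z ** 2


gate = MultiplyGate()
x, y = 3.0, -4.0
z = gate.forward(x, y)       # z = -12.0
dz = 2 * z                   # dL/dz for L = z**2
dx, dy = gate.backward(dz)   # analytic dL/dx and dL/dy

# Numerical estimate of dL/dx by central differences, holding y fixed.
h = 1e-6
num_dx = (loss((x + h) * y) - loss((x - h) * y)) / (2 * h)

print("analytic dx:", dx)
print("numeric  dx:", num_dx)
print("dz / y     :", dz / y)  # the formula from the question, for comparison
```

With these assumed values the analytic and numerical gradients agree, while dz / y gives a different number, because dz in this code is dL/dz rather than a change in z.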