I am writing code for a primal SVM that uses SGD (stochastic subgradient descent) to optimize the weight vector w.
The classification rule is sign(w*x + bias).
My question is: how do I find the best bias?
I guess it has to be done while optimizing w, but how? I have no idea.
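For concreteness, here is a rough sketch of the kind of setup in question (not the asker's actual code), assuming hinge loss with L2 regularization and labels in {-1, +1}; the names `sgd_step_w_only`, `lam`, and `lr` are purely illustrative:

```python
import numpy as np

# One SGD step on the hinge loss
#   L = lam/2 * ||w||^2 + max(0, 1 - y * (<w, x> + bias)),
# updating only w and leaving open how the bias should be updated.
def sgd_step_w_only(w, bias, x, y, lam=0.01, lr=0.001):
    if y * (np.dot(w, x) + bias) < 1:   # margin violated
        w = w - lr * (lam * w - y * x)
    else:                               # only the regularizer contributes
        w = w - lr * lam * w
    return w                            # how to update bias is the question

def predict(w, bias, x):
    """Classification rule from the question: sign(<w, x> + bias)."""
    return np.sign(np.dot(w, x) + bias)
```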
Your hypothesis is sign(<w, x> + b). Think for a second about x' = [x 1]; then you can express your hypothesis as sign(<w', x'>), where w' = [w b]. I hope this makes it clear that b is no different from the entries of w (the only difference is that your regularization term ||w||^2 does not involve b). Thus you just need dL/db, where L is your loss function.
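To make that last step concrete, here is a minimal sketch (not taken from the answer itself) of a primal SGD loop that uses dL/db directly, assuming hinge loss with L2 regularization on w only and labels in {-1, +1}; `svm_sgd`, `lam`, `lr`, and `epochs` are illustrative names:

```python
import numpy as np

def svm_sgd(X, y, lam=0.01, lr=0.001, epochs=20, seed=None):
    """Primal SVM via stochastic subgradient descent on
    L = lam/2 * ||w||^2 + max(0, 1 - y * (<w, x> + b)).
    The subgradient w.r.t. b is -y when the margin is violated and 0
    otherwise; b never appears in the regularization term."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        for i in rng.permutation(n):
            margin = y[i] * (np.dot(w, X[i]) + b)
            if margin < 1:
                # Margin violated: subgradients are lam*w - y*x and -y.
                w -= lr * (lam * w - y[i] * X[i])
                b -= lr * (-y[i])
            else:
                # Only the regularizer contributes; b is left untouched.
                w -= lr * lam * w
    return w, b

def predict(w, b, X):
    return np.sign(X @ w + b)
```

Equivalently, per the augmentation above, you can append a constant 1 feature to every x and fold b into w, as long as the regularizer is still applied only to the original coordinates of w.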