Can someone with expertise explain how the following vectorized form of the multiple linear regression cost function is derived, given the independent variable matrix X (including the intercept) and the dependent variable matrix Y, where X has m rows and n columns and there are n theta parameters? In Andrew Ng's class, I am a bit lost on how this and the non-vectorized cost function are the same.
Ah! I think I got the answer. I forgot that the error part of the function is the square of a vector, so it is written as the transpose of the error vector times the error vector itself. I am still not able to understand how X is defined with the transposes of all the independent variables in the above definition, as I believe it is a matrix of independent variables including the intercept.
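For anyone with the same doubt, here is a small numerical check, offered as a sketch rather than the official course derivation. It assumes the vectorized form in question is the usual one, J(theta) = (1/(2m)) * (X theta - y)^T (X theta - y), and that each training example x^(i) is a column vector of length n with a leading 1 for the intercept. Under that assumption, row i of X is simply the transpose of x^(i), so X is still the m-by-n matrix of independent variables including the intercept; the transposes only describe how the column vectors are laid out as rows. The toy data and the names `cost_loop` and `cost_vectorized` are made up for illustration.

```python
import numpy as np

# Assumed toy data (hypothetical, not from the post): m = 4 examples, 2 features.
features = np.array([[1.0, 2.0],
                     [2.0, 0.5],
                     [3.0, 1.5],
                     [4.0, 3.0]])
y = np.array([3.0, 2.5, 4.0, 6.5])
theta = np.array([0.5, 1.0, 0.25])  # n = 3 parameters, intercept first

m = features.shape[0]

# Design matrix: row i is the transpose of example i's column vector x^(i),
# with a leading 1 for the intercept. Shape is (m, n).
X = np.column_stack([np.ones(m), features])


def cost_loop(X, y, theta):
    """Non-vectorized cost: add up the squared error one example at a time."""
    m, n = X.shape
    total = 0.0
    for i in range(m):
        prediction = 0.0
        for j in range(n):
            prediction += theta[j] * X[i, j]
        total += (prediction - y[i]) ** 2
    return total / (2 * m)


def cost_vectorized(X, y, theta):
    """Vectorized cost: e = X theta - y, then e^T e / (2m)."""
    m = X.shape[0]
    error = X @ theta - y
    # For a 1-D array, error @ error is the dot product e^T e,
    # which is the sum of the squared errors.
    return (error @ error) / (2 * m)


loop_value = cost_loop(X, y, theta)
vectorized_value = cost_vectorized(X, y, theta)
print(loop_value, vectorized_value)
```

The two printed values should agree up to floating-point rounding, because e^T e is by definition the sum of the squared entries of e, and entry i of X theta is the prediction for example i.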