I'm trying to calculate this regression formula, but I have a problem with the dimensions; they do not match:
Where:
$X$ - a matrix with dimensions 200x20 (n = 200 samples, p = 20 predictors),
$y$ - a matrix with dimensions 200x1,
$\beta^{(k)}$ - a sequence of coefficients, dimensions 20x1, with k = 1, 2, 3, ...
$X^T$ - dimensions 20x200,
$j$ - a value from 1 to p, so from 1 to 20.
The problem is when I calculate $y - X\beta^{(k-1)}$.
For example, for k = 20 (so k - 1 = 19) I have $\beta^{(19)}$, and the dimensions do not match for the subtraction: 200x1 - 200x20 x 1x1 = 200x1 - 200x20 will not work.
If I take the whole beta vector, then the dimensions are correct. Does $\beta^{(19)}$ mean taking the 19th value of beta and multiplying it with the matrix X?
Source of the formula:
You should be using the entire beta vector at each stage of the calculation.
(Tibshirani has been a bit permissive with his use of notation, perhaps...)
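To make the dimension bookkeeping concrete, here is a minimal sketch in plain Python. The data are made up, and `matvec` and `beta_prev` are hypothetical names chosen for illustration; the only point is that multiplying X by the whole 20x1 vector gives a 200x1 result that can be subtracted from y.

```python
n, p = 200, 20

# Toy data standing in for the real X (200x20) and y (200x1).
X = [[float((i + j) % 7) for j in range(p)] for i in range(n)]
y = [float(i % 5) for i in range(n)]

# The whole coefficient vector from the previous step (20x1), not a single entry.
beta_prev = [0.1] * p

def matvec(A, v):
    """Multiply a (rows x cols) matrix by a (cols x 1) vector."""
    assert all(len(row) == len(v) for row in A), "inner dimensions must match"
    return [sum(a * b for a, b in zip(row, v)) for row in A]

fitted = matvec(X, beta_prev)                      # (200x20)(20x1) -> 200x1
residual = [yi - fi for yi, fi in zip(y, fitted)]  # 200x1 - 200x1 -> 200x1

print(len(fitted), len(residual))
```

Passing a single number in place of `beta_prev` would fail inside `matvec`, which is the same mismatch described in the question.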
The k is just a counter for which step of the algorithm we are on. Right at the start (k = 0 or "step 0") we initialise the entire beta vector to have all elements equal to zero: $\beta^{(0)} = (0, 0, \ldots, 0)^T$.
At each step of the algorithm (steps k = 1, 2, 3... and so on) we use our previous estimate of the vector beta ($\beta^{(k-1)}$, calculated in step k - 1) to calculate a new improved estimate for the vector beta ($\beta^{(k)}$). The superscript number is not an index into the vector; rather, it is a label telling us at which stage of the algorithm that beta vector was produced.
I hope this makes sense. The important point is that each of the $\beta^{(k)}$ values is a different 20x1 vector.
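The iteration can be sketched as a loop that keeps one full vector per step. This is only an illustration: the update used here is a plain gradient step on the least-squares loss, $\beta^{(k)} = \beta^{(k-1)} + t\,X^T(y - X\beta^{(k-1)})$, which is an assumption standing in for the formula in the question, and the data, the step size `t` and the helper names are all made up.

```python
n, p = 200, 20

# Toy data standing in for the real X (200x20) and y (200x1).
X = [[((i * 7 + j * 3) % 11) / 10.0 for j in range(p)] for i in range(n)]
y = [float(i % 5) for i in range(n)]

def matvec(A, v):
    """Multiply a (rows x cols) matrix by a (cols x 1) vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def rss(beta):
    """Residual sum of squares for a full coefficient vector."""
    return sum((yi - fi) ** 2 for yi, fi in zip(y, matvec(X, beta)))

Xt = [list(col) for col in zip(*X)]  # X^T, 20x200

# Assumed conservative step size: 1 / trace(X^T X), which is never larger
# than 1 / (largest eigenvalue of X^T X).
t = 1.0 / sum(x * x for row in X for x in row)

betas = [[0.0] * p]  # step 0: beta^(0) is the 20x1 zero vector
for k in range(1, 11):
    prev = betas[k - 1]  # the whole vector beta^(k-1)
    residual = [yi - fi for yi, fi in zip(y, matvec(X, prev))]  # 200x1
    direction = matvec(Xt, residual)                            # 20x1
    betas.append([b + t * d for b, d in zip(prev, direction)])  # beta^(k)

print(len(betas), len(betas[-1]))
```

Here `betas[k]` plays the role of $\beta^{(k)}$: the list index picks out which step produced the vector, and each entry is itself a full vector of length 20.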