Search code examples
rmatrixqr-decomposition

Writing a Householder QR factorization function in R code


I am working on a piece of code to find the QR factorization of a matrix in R.

X <- structure(c(0.8147, 0.9058, 0.127, 0.9134, 0.6324, 0.0975, 0.2785, 
0.5469, 0.9575, 0.9649, 0.1576, 0.9706, 0.9572, 0.4854, 0.8003
), .Dim = c(5L, 3L))


myqr <- function(A) {
  n <- nrow(A)
  p <- ncol(A)
  Q <- diag(n)
  Inp <- diag(nrow = n, ncol = p)

  for(k in c(1:ncol(A))) {
    # extract the kth column of the matrix
    col<-A[k:n,k]
    # calculation of the norm of the column in order to create the vector
    norm1<-sqrt(sum(col^2))
    # Define the sign positive if a1 > 0 (-) else a1 < 0(+)  
    sign <- ifelse(col[1] >= 0, -1, +1)  
    # Calculate of the vector a_r
    a_r <- col - sign * Inp[k:n,k] * norm1
    # beta = 2 / ||a-r||^2  
    beta <- 2 / sum(t(a_r) %*% a_r)
    # the next line of code calculates the matrix Q in every step
    Q <- Q - beta *Q %*% c(rep(0,k-1),a_r) %*% t(c(rep(0,k-1),a_r))    
    # calculates the matrix R in each step
    A[k:n,k:p] <- A[k:n,k:p] - beta * a_r %*% t(a_r) %*% A[k:n,k:p]
    }

  list(Q=Q,R=A)
  }

But, Here I have not calculated in every step the matrix H that represents the householder reflection, also I have not calculated the matrix A in every step.

As H = I - 2 v v', if I multiply by Q I obtain

QH = Q - 2 (Qv) v'    // multiplication on the left
HQ = Q - 2 v (Q'v)'    // multiplication on the right

Now, this operations should be work in every step. However if I consider the first matrix H and he the second matrix H1.... these matrices will be of smaller that the first one. In order to avoid that I have used the next line of code:

 Q <- Q - beta * Q %*% c(rep(0,k-1),a_r) %*% t(c(rep(0,k-1),a_r))

but, I am not sure why the code is working well, when I generate the new vector a_r with the first k entries of zeros at every step.


Solution

  • I thought you want exactly the same output as returned by qr.default, which uses compact QR storage. But then I realized that you are storing Q and R factors separately.

    Normally, QR factorization only forms R but not Q. In the following, I will describe QR factorization where both are formed. For those who lack basic understanding of QR factorization, please read this first: lm(): What is qraux returned by QR decomposition in LINPACK / LAPACK, where there are neat math formulae arranged in LaTeX. In the following, I will assume that one knows what a Householder reflection is and how it is computed.


    QR factorization procedure

    First of all, a Householder refection vector is H = I - beta * v v' (where beta is computed as in your code), not H = I - 2 * v v'.

    Then, QR factorization A = Q R proceeds as (Hp ... H2 H1) A = R, where Q = H1 H2 ... Hp. To compute Q, we initialize Q = I (identity matrix), then multiply Hk on the right iteratively in the loop. To compute R, we initialize R = A and multiply Hk on the left iteratively in the loop.

    Now, at k-th iteration, we have a rank-1 matrix update on Q and A:

    Q := Q Hk = Q (I - beta v * v') = Q - (Q v) (beta v)'
    A := Hk A = (I - beta v * v') A = A - (beta v) (A' v)'
    

    v = c(rep(0, k-1), a_r), where a_r is the reduced, non-zero part of a full reflection vector.

    The code you have is doing such update in a brutal force:

    Q <- Q - beta * Q %*% c(rep(0,k-1), a_r) %*% t(c(rep(0,k-1),a_r))
    

    It first pads a_r to get the full reflection vector and performs the rank-1 update on the whole matrix. But actually we can drop off those zeros and write (do some matrix algebra if unclear):

    Q[,k:n] <- Q[,k:n] - tcrossprod(Q[, k:n] %*% a_r, beta * a_r)
    A[k:n,k:p] <- A[k:n,k:p] - tcrossprod(beta * a_r, crossprod(A[k:n,k:p], a_r))
    

    so that only a fraction of Q and A are updated.


    Several other comments on your code

    • You have used t() and "%*%" a lot! But almost all of them can be replaced by crossprod() or tcrossprod(). This eliminates the explicit transpose t() and is more memory efficient;
    • You initialize another diagonal matrix Inp which is not necessary. To get householder reflection vector a_r, you can replace

      sign <- ifelse(col[1] >= 0, -1, +1)
      a_r <- col - sign * Inp[k:n,k] * norm1
      

      by

      a_r <- col; a_r[1] <- a_r[1] + sign(a_r[1]) * norm1
      

      where sign is an R base function.


    R code for QR factorization

    ## QR factorization: A = Q %*% R
    ## if `complete = FALSE` (default), return thin `Q`, `R` factor
    ## if `complete = TRUE`, return full `Q`, `R` factor
    
    myqr <- function (A, complete = FALSE) {
    
      n <- nrow(A)
      p <- ncol(A)
      Q <- diag(n)
    
      for(k in 1:p) {
        # extract the kth column of the matrix
        col <- A[k:n,k]
        # calculation of the norm of the column in order to create the vector r
        norm1 <- sqrt(drop(crossprod(col)))
        # Calculate of the reflection vector a-r
        a_r <- col; a_r[1] <- a_r[1] + sign(a_r[1]) * norm1
        # beta = 2 / ||a-r||^2  
        beta <- 2 / drop(crossprod(a_r))
        # update matrix Q (trailing matrix only) by Householder reflection
        Q[,k:n] <- Q[,k:n] - tcrossprod(Q[, k:n] %*% a_r, beta * a_r)
        # update matrix A (trailing matrix only) by Householder reflection
        A[k:n, k:p] <- A[k:n, k:p] - tcrossprod(beta * a_r, crossprod(A[k:n,k:p], a_r))
        }
    
      if (complete) {
         A[lower.tri(A)] <- 0
         return(list(Q = Q, R = A))
         }
      else {
        R <- A[1:p, ]; R[lower.tri(R)] <- 0
        return(list(Q = Q[,1:p], R = R))
        }
      }
    

    Now let's have a test:

    X <- structure(c(0.8147, 0.9058, 0.127, 0.9134, 0.6324, 0.0975, 0.2785, 
    0.5469, 0.9575, 0.9649, 0.1576, 0.9706, 0.9572, 0.4854, 0.8003
    ), .Dim = c(5L, 3L))
    
    #       [,1]   [,2]   [,3]
    #[1,] 0.8147 0.0975 0.1576
    #[2,] 0.9058 0.2785 0.9706
    #[3,] 0.1270 0.5469 0.9572
    #[4,] 0.9134 0.9575 0.4854
    #[5,] 0.6324 0.9649 0.8003
    

    First for thin-QR version:

    ## thin QR factorization
    myqr(X)
    
    #$Q
    #            [,1]       [,2]        [,3]
    #[1,] -0.49266686 -0.4806678  0.17795345
    #[2,] -0.54775702 -0.3583492 -0.57774357
    #[3,] -0.07679967  0.4754320 -0.63432053
    #[4,] -0.55235290  0.3390549  0.48084552
    #[5,] -0.38242607  0.5473120  0.03114461
    #
    #$R
    #          [,1]       [,2]       [,3]
    #[1,] -1.653653 -1.1404679 -1.2569776
    #[2,]  0.000000  0.9660949  0.6341076
    #[3,]  0.000000  0.0000000 -0.8815566
    

    Now the full QR version:

    ## full QR factorization
    myqr(X, complete = TRUE)
    
    #$Q
    #            [,1]       [,2]        [,3]       [,4]       [,5]
    #[1,] -0.49266686 -0.4806678  0.17795345 -0.6014653 -0.3644308
    #[2,] -0.54775702 -0.3583492 -0.57774357  0.3760348  0.3104164
    #[3,] -0.07679967  0.4754320 -0.63432053 -0.1497075 -0.5859107
    #[4,] -0.55235290  0.3390549  0.48084552  0.5071050 -0.3026221
    #[5,] -0.38242607  0.5473120  0.03114461 -0.4661217  0.5796209
    #
    #$R
    #          [,1]       [,2]       [,3]
    #[1,] -1.653653 -1.1404679 -1.2569776
    #[2,]  0.000000  0.9660949  0.6341076
    #[3,]  0.000000  0.0000000 -0.8815566
    #[4,]  0.000000  0.0000000  0.0000000
    #[5,]  0.000000  0.0000000  0.0000000
    

    Now let's check standard result returned by qr.default:

    QR <- qr.default(X)
    
    ## thin R factor
    qr.R(QR)
    #          [,1]       [,2]       [,3]
    #[1,] -1.653653 -1.1404679 -1.2569776
    #[2,]  0.000000  0.9660949  0.6341076
    #[3,]  0.000000  0.0000000 -0.8815566
    
    ## thin Q factor
    qr.Q(QR)
    #            [,1]       [,2]        [,3]
    #[1,] -0.49266686 -0.4806678  0.17795345
    #[2,] -0.54775702 -0.3583492 -0.57774357
    #[3,] -0.07679967  0.4754320 -0.63432053
    #[4,] -0.55235290  0.3390549  0.48084552
    #[5,] -0.38242607  0.5473120  0.03114461
    
    ## full Q factor
    qr.Q(QR, complete = TRUE)
    #            [,1]       [,2]        [,3]       [,4]       [,5]
    #[1,] -0.49266686 -0.4806678  0.17795345 -0.6014653 -0.3644308
    #[2,] -0.54775702 -0.3583492 -0.57774357  0.3760348  0.3104164
    #[3,] -0.07679967  0.4754320 -0.63432053 -0.1497075 -0.5859107
    #[4,] -0.55235290  0.3390549  0.48084552  0.5071050 -0.3026221
    #[5,] -0.38242607  0.5473120  0.03114461 -0.4661217  0.5796209
    

    So our results are correct!