How to code quadratic form both naively and efficiently

I'm trying to code a quadratic form Z'(S)^{-1} Z

The code is as below

z <- matrix(rnorm(200 * 100), 200, 100)
S <- cov(z)
quad.naive <- function(z, S) {
        Sinv <- solve(S)
        rowSums((z %*% Sinv) * z) 
}

However, I'm not sure I understand thoroughly the last line of the function

rowSums((z %*% Sinv) * z)

Because naively, we should just type exactly the same as the mathematical formula which is

t(Z) %*% Sinv %*% Z

So, anyone can explain why is the row sums form the same as the naive mathematical form, esp. why after two metrics (z, and Sin) multiplication, it use a element-wise multiply symbol * to times Z, rather than use %*%.

(z %*% Sinv) * z

Solution

Ok, following our exchange in the comments and the link you gave to the relevant section in the book Advanced Statistical Computing I think I understand what the issue is.

I post this a separate (and real) answer, to avoid confusing future readers who may want to read through the train of thoughts in the comments.

Let's return to the code given in your post (which is copied from section 1.3.3 Multivariate Normal Distribution)

set.seed(2017-07-13)
z <- matrix(rnorm(200 * 100), 200, 100)
S <- cov(z)
quad.naive <- function(z, S) {
        Sinv <- solve(S)
        rowSums((z %*% Sinv) * z)
}

Considering that the quadratic form is defined as the scalar quantity z' Sigma^-1 z (or in R language t(z) %*% solve(Sigma) %*% z) for a random p × 1 column vector, two questions may arise:

Why is z given as a matrix (instead of a p-dimensional column vector, as stated in the book), and
what is the reason for using rowSums in quad.naive?

First off, keep in mind that the quadratic form is a scalar quantity for a single random multivariate sample. What quad.naive is actually returning is the distribution of the quadratic form in multivariate samples (plural!). z here contains 200 samples from a p = 100-dimensional normal.

Then S is the 100 x 100 covariance matrix, and solve(S) returns the inverse matrix of S. The quantity z %*% Sinv * z (the additional brackets are not necessary due to R's operator precedence) returns the diagonal elements of t(z) %*% solve(Sigma) %*% z for every sample of z as row vectors in a matrix. Taking the rowSums is then the same as taking the trace (i.e. having the quadratic form return a scalar for every sample). Also note that you get the same result with diag(z %*% solve(Sigma) %*% t(z)), but in quad.naive we avoid the double matrix multiplication and additional transposition.

A more fundamental question remains: Why look at the distribution of quadratic forms? It can be shown that the distribution of certain quadratic forms in standard normal variables follows a chi-square distribution (see e.g. Mathai and Provost, Quadratic Forms in Random Variables: Theory and Applications and Normal distribution - Quadratic forms)

Specifically, we can show that the quadratic form (x - μ)' Σ^-1 (x - μ) for a p × 1 column vector is chi-square distributed with p degrees of freedom.

To illustrate this, let's draw 100 samples from a bivariate standard normal, and calculate the quadratic forms for every sample.

set.seed(2020)
nSamples <- 100
z <- matrix(rnorm(nSamples * 2), nSamples, 2)
S <- cov(z)
Sinv <- solve(S)
dquadform <- rowSums(z %*% Sinv * z)

We can visualise the distribution as a histogram and overlay the theoretical chi-square density for 2 degrees of freedom.

library(ggplot2)
bw = 0.2
ggplot(data.frame(x = dquadform), aes(x)) +
    geom_histogram(binwidth = bw) +
    stat_function(fun = function(x) dchisq(x, df = 2) * nSamples * bw)

Finally, results from a Kolmogorov-Smirnov test comparing the distribution of the quadratic forms with the cumulative chi-square distribution with 2 degrees of freedom lead us to fail to reject the null hypothesis (of the equality of both distributions).

ks.test(dquadform, pchisq, df = 2)
#
#   One-sample Kolmogorov-Smirnov test
#
#data:  dquadform
#D = 0.063395, p-value = 0.8164
#alternative hypothesis: two-sided