
Plot the F-distribution from an lm object in R


Suppose we have two variables that we wish to build a model from:

set.seed(10239)
x <- rnorm(100)   # 100 standard normal draws
y <- rnorm(100)
model <- lm(x ~ y)

class(model)
# [1] "lm"

summary(model)
# 
# Call:
# lm(formula = x ~ y)
# 
# Residuals:
#      Min       1Q   Median       3Q      Max 
# -3.08676 -0.63022 -0.01115  0.75280  2.35169 
# 
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept) -0.07188    0.11375  -0.632    0.529
# y            0.06999    0.12076   0.580    0.564
# 
# Residual standard error: 1.117 on 98 degrees of freedom
# Multiple R-squared:  0.003416,    Adjusted R-squared:  -0.006754 
# F-statistic: 0.3359 on 1 and 98 DF,  p-value: 0.5635

How do you plot the F-distribution of the model object?


Solution

  • If you check the structure of the summary of your model with str(summary(model)), you'll notice that the parameters of the F-distribution of interest are available in summary(model)$fstatistic. This is a named numeric vector: the first element is the F-statistic, and the next two elements are the numerator and denominator degrees of freedom, in that order. So to plot the F-distribution, try something like the following:

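    # Store the result under a name other than df, so it is not confused
    # with the density function df() used below.
    fstat <- summary(model)$fstatistic
    curve(df(x, df1 = fstat[2], df2 = fstat[3]), from = 0, to = 100)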
    

    Alternatively, you can get the parameters of the F-distribution from the model itself. For a model with an intercept, the numerator degrees of freedom is one less than the number of coefficients, and the denominator degrees of freedom is the number of observations minus the number of coefficients.

    df1 <- length(model$coefficients) - 1
    df2 <- length(model$residuals) - df1 - 1
    curve(df(x, df1 = df1, df2 = df2), from = 0, to = 100)
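
As a rough check on the two approaches above, the sketch below rebuilds the model from the question, confirms that both routes give the same degrees of freedom, and marks the observed F-statistic on the density curve. The names fstat and p_value and the plotting range of 0.01 to 5 are illustrative choices, not part of the original answer, and the code assumes the model includes an intercept.

```r
# Recreate the model from the question so this block runs on its own.
set.seed(10239)
x <- rnorm(100)
y <- rnorm(100)
model <- lm(x ~ y)

# Route 1: read the F-statistic and its degrees of freedom from the summary.
fstat <- summary(model)$fstatistic   # named vector: value, numdf, dendf

# Route 2: derive the degrees of freedom from the fitted model
# (assumes an intercept is included).
df1 <- length(coef(model)) - 1
df2 <- df.residual(model)            # observations minus coefficients

# Both routes should agree.
stopifnot(df1 == fstat[["numdf"]], df2 == fstat[["dendf"]])

# p-value of the overall F-test: upper-tail area beyond the observed statistic.
p_value <- pf(fstat[["value"]], df1, df2, lower.tail = FALSE)

# Plot the density over a range where it is visibly non-zero. Starting just
# above 0 avoids the infinite density at 0 when df1 = 1. Inside curve(), x is
# the plotting variable, not the vector x defined above.
curve(df(x, df1 = df1, df2 = df2), from = 0.01, to = 5,
      xlab = "F", ylab = "Density")
abline(v = fstat[["value"]], lty = 2)   # observed F-statistic
```

The area under the curve to the right of the dashed line corresponds to p_value, which should match the p-value reported on the last line of summary(model).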