I was building a logistic regression model in R, but when I checked the coefficients with summary(model), the output showed NA in all four columns (Estimate, Std. Error, z value and Pr(>|z|)) for one of my independent variables. My other three variables worked fine.
I also checked for missing values, but there were none. I tried converting the variable between continuous and discrete using as.numeric and as.integer, but it still comes out as NA in the output. The variable itself measures the total volume of blood donated.
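Roughly what I tried looks like this (the data frame and variable names here are illustrative, not my real data):

# No missing values in the predictor
sum(is.na(donors$total_volume))   # returns 0

# Tried both encodings; summary(model) still shows NA for total_volume
donors$total_volume <- as.numeric(donors$total_volume)
model <- glm(donated ~ total_volume + recency + frequency + time,
             family = binomial, data = donors)
summary(model)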
I can't figure this out and it is bothering me. Thanks
Here is an example elaborating on the comment I made above; I'm using a simple linear model here, but the same principle applies to your logistic regression model.
Let's generate some data for the model y = x1 + x2 + epsilon, where the two predictor variables x1 and x2 are linearly dependent: x2 = 2.5 * x1.
# Generate sample data for y = x1 + x2 + noise,
# where x2 is an exact linear function of x1
set.seed(2017)
x1 <- seq(1, 100)
x2 <- 2.5 * x1
y <- x1 + x2 + rnorm(100)
We fit the model.
df <- cbind.data.frame(x1 = x1, x2 = x2, y = y)
fit <- lm(y ~ x1 + x2, df)
Now look at the parameter estimates.
summary(fit)
#
#Call:
#lm(formula = y ~ x1 + x2, data = df)
#
#Residuals:
#     Min       1Q   Median       3Q      Max
#-2.50288 -0.75360 -0.01388  0.67935  3.08515
#
#Coefficients: (1 not defined because of singularities)
#             Estimate Std. Error t value Pr(>|t|)
#(Intercept)  0.166567   0.215534   0.773    0.441
#x1           3.496831   0.003705 943.719   <2e-16 ***
#x2                 NA         NA      NA       NA
#---
#Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
#Residual standard error: 1.07 on 98 degrees of freedom
#Multiple R-squared: 0.9999, Adjusted R-squared: 0.9999
#F-statistic: 8.906e+05 on 1 and 98 DF, p-value: < 2.2e-16
You can see that the estimates for x2 are NA. This is a direct consequence of x1 and x2 being linearly dependent: R cannot estimate separate coefficients for the two, so lm drops the aliased term (hence the "1 not defined because of singularities" message). In other words, x2 is redundant, and the data can be described by the estimated linear model y = 3.4968 * x1 + epsilon; this agrees well with the theoretical model, since y = x1 + x2 + epsilon = x1 + 2.5 * x1 + epsilon = 3.5 * x1 + epsilon.
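If you want to confirm this directly on your own fit, a quick check along these lines should work (a sketch using the df and fit objects from above; alias() is in base R's stats package and should also work on a glm, since glm objects inherit from lm):

# Confirm that x2 is an exact linear function of x1
cor(df$x1, df$x2)   # exactly 1

# alias() reports linearly dependent (aliased) terms;
# here it shows x2 = 2.5 * x1
alias(fit)

# Refitting without the redundant predictor gives a full set of estimates
fit2 <- lm(y ~ x1, data = df)
summary(fit2)

For your data, I would check whether the total blood volume is an exact multiple of one of your other three predictors (for example, a fixed volume per donation times the number of donations); if so, it is aliased exactly as x2 is here, and you should drop one of the two.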