I wish to draw 1,000 random samples of size 50 from the dataset and show that E(xi^ui) = 0 holds for each simulated sample. My code is below, and I have been trying to debug for some time now, but I can't figure out what's wrong.
The dataset is called 'dataset' and it has two columns: 'y' and 'x'. I want to regress y on x, get the residuals, and show that they are not correlated with x.
lm.fit <- NA
resid.lm.fit <- NA
correlation <- NA
for (i in 1:1000){
samp1 <- sample(dataset, size=50, replace=T)
lm.fit[i] <- lm(y ~ x, data=samp1)
resid.lm.fit[i]<-resid(lm.fit[i])
correlation[i] <- cor.test(resid.lm.fit[i],samp1$x)
}
The errors I am getting are:
Error in resid.lm.fit[i] <- resid(lm.fit[i]) :
replacement has length zero
In addition: Warning message:
In lm.fit[i] <- lm(y ~ x, data = samp1) :
number of items to replace is not a multiple of replacement length
The problem is that you are trying to store a bunch of non-atomic objects in vectors. If you make lm.fit
, resid.lm.fit
, and correlation
lists instead of vectors you will be okay:
set.seed(123)
dataset <- data.frame(
x=1:250,
y=3*(1:250)+rnorm(250,0,40))
##
lm.fit <- list(NULL)
resid.lm.fit <- list(NULL)
correlation <- list(NULL)
for (i in 1:1000){
samp1 <- dataset[sample(1:nrow(dataset), size=50, replace=T),]
lm.fit[[i]] <- lm(y ~ x, data=samp1)
resid.lm.fit[[i]] <- resid(lm.fit[[i]])
correlation[[i]] <- cor.test(resid.lm.fit[[i]],samp1$x)
}
Then you can check the results of individual cor.test
s like this:
> correlation[[1]]
Pearson's product-moment correlation
data: resid.lm.fit[[i]] and samp1$x
t = 0, df = 48, p-value = 1
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
-0.2783477 0.2783477
sample estimates:
cor
2.991262e-17
Also, to sample from a data.frame
use df[ sample(1:nrow(df),...), ]
, not sample(df,...)
.