I am trying to use the Stata bootstrap command to estimate the standard deviation of the median of the hours variable from the nlsw88 dataset.
I can't get an answer, but when I change the variable in my code from hours to wage, it works. The only difference between these two variables is that hours is an integer while wage has decimals. I even changed the storage type of hours from byte to float, but nothing changed.
sysuse nlsw88
bootstrap r(p50), nodots: summarize hours , detail
You are bootstrapping the median (p50), but in practice the median does not vary between the bootstrap samples.
I am not sure exactly how bootstrap works, but you are not setting a sample size, and 48.75% of the observations have the value 40, which is also close to the mean. This makes it very likely that the median is 40 in every sample, so there is no variance between the samples.
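One way to check this explanation is to look at the distribution of hours and then at the bootstrapped medians themselves. The following is only a sketch: the file name bsmed, the number of replications, and the seed are arbitrary choices of mine, not anything from your code, and saving() will write bsmed.dta to the working directory.

```stata
* Sketch: inspect why the bootstrapped median of hours shows no spread.
sysuse nlsw88, clear

* Share of observations at each value of hours
tabulate hours

* Keep the replicates. "bsmed" is a hypothetical file name;
* reps(200) and seed(12345) are arbitrary illustrative values.
bootstrap p50=r(p50), reps(200) seed(12345) nodots saving(bsmed, replace): summarize hours, detail

* Distribution of the median across the bootstrap samples
use bsmed, clear
tabulate p50
```

If the reasoning above holds, the last tabulation should show the same median in all, or nearly all, replicates.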
You would have to pick a very small sample size to be likely to get any variance in the median between the randomly drawn samples. For example, you could sample only 5 observations in each bootstrap round:
sysuse nlsw88
bootstrap r(p50) , size(5) nodots: summarize hours , detail
This might explain why you are getting missing values (which is your question), but it does not answer what you should do instead. Ask yourself whether you really meant to use the median, and whether bootstrap is the right method to use here.