
Why does bootstrap return missing values for the standard deviation, CI, and other statistics?


I am trying to use the Stata bootstrap command to estimate the standard deviation of the median of the hours variable in the nlsw88 dataset.

I can't get an answer for hours, but when I change the variable in my code from hours to wage, it works. The only difference I can see between the two variables is that hours is an integer while wage has decimals. I even changed the storage type of hours from byte to float, but nothing changed.

sysuse nlsw88
bootstrap r(p50), nodots: summarize hours, detail

Solution

  • You are bootstrapping the median (p50), but in practice the median does not vary across the bootstrap samples.

    You are not setting a sample size, so each bootstrap sample is as large as the dataset (the default is size(_N)), and 48.75% of the observations have the value 40, which is also close to the mean. Together, this makes it very likely that the median is 40 in every sample, so there is no variance between the samples.

    You would have to pick a very small sample size to have a realistic chance of seeing any variance in the median across the randomly drawn samples, for example by sampling only 5 observations in each bootstrap round.

    sysuse nlsw88
    bootstrap r(p50), size(5) nodots: summarize hours, detail
    

    This might explain why you are getting missing values (which is your question), but it does not tell you what to do instead. Ask yourself whether you really meant to use the median, and whether bootstrap is the right method here.
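
    One way to check the explanation above is to look at how concentrated hours is and then keep the replicate medians so they can be inspected. This is only a sketch: the file name bs_median, the expression name med, the seed 12345, and reps(200) are arbitrary choices for illustration, not part of the original code.

    ```stata
    sysuse nlsw88, clear

    * Share of observations at each value of hours
    * (look at the row for 40).
    tabulate hours

    * Re-run the bootstrap, naming the statistic "med" and saving
    * every replicate to bs_median.dta (hypothetical file name).
    * seed() only makes the resampling reproducible.
    bootstrap med = r(p50), reps(200) seed(12345) ///
        saving(bs_median, replace) nodots: summarize hours, detail

    * One row per replicate: tabulate the replicate medians.
    use bs_median, clear
    tabulate med
    ```

    If the reasoning above holds, the last table should show a single value (or very nearly so) for med. A statistic that is constant across replicates has a bootstrap standard error of zero, so the quantities derived from it cannot be computed.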