I'm slightly confused by the as.svrepdesign function's use of the fpc from a design object.
The example from the documentation shows the following:
## one-stage cluster sample
dclus1<-svydesign(id=~dnum, weights=~pw, data=apiclus1, fpc=~fpc)
## convert to bootstrap
bclus1<-as.svrepdesign(dclus1,type="bootstrap", replicates=100)
So that seems fine. My understanding is it will essentially use the bootstrap to compute statistics of interest over the survey design.
But is the FPC applied or not? I suspect it is not, because the code also runs fine if you choose type="subbootstrap", and it shouldn't: the subbootstrap does not support a finite population correction.
So I'm confused: is the FPC applied or not when using as.svrepdesign? If it is not applied, I'm not entirely clear how to compute the vector required for as.svrepdesign for a one-stage cluster design.
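One way to probe this empirically is to compare standard errors across designs. The following is a minimal sketch, not a definitive test. It assumes the survey package and its api data are installed, and it introduces a hypothetical column, fpc_small, that sets the population size to twice the number of sampled clusters. That gives an artificial sampling fraction of 0.5, so that any effect of the fpc stands out from bootstrap noise. The helper se_of is likewise only illustrative.

```r
library(survey)
data(api)
set.seed(42)  # bootstrap replicates are random

# Hypothetical population size (NOT the real one): twice the number of
# sampled clusters, i.e. a sampling fraction of 0.5.
apiclus1$fpc_small <- 2 * length(unique(apiclus1$dnum))

# Same one-stage cluster sample, with and without an fpc
d_fpc   <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc_small)
d_nofpc <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1)

# Illustrative helper: standard error of the estimated mean of api00
se_of <- function(design) as.numeric(SE(svymean(~api00, design)))

ses <- c(
  linear_fpc   = se_of(d_fpc),
  linear_nofpc = se_of(d_nofpc),
  boot_fpc     = se_of(as.svrepdesign(d_fpc, type = "bootstrap",
                                      replicates = 1000)),
  subboot_fpc  = se_of(as.svrepdesign(d_fpc, type = "subbootstrap",
                                      replicates = 1000))
)
print(round(ses, 2))
```

If a replicate-weight standard error lands near linear_fpc, that suggests the fpc is being used for that type; if it lands near linear_nofpc, that suggests it is being ignored. Bootstrap standard errors are noisy, so this is evidence rather than proof.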
Update
I'll add that I'm not the first to note the confusion around the FPC and bootstrap resampling in the survey package. This is from Mashreghi et al. (2016):
The user guide of version 3.30-3 of the package, dated February 20, 2015, presents the functions bootweights, subbootweights, and mrbweights. According to the documentation, “Bootstrap weights for infinite populations (’with replacement’ sampling) are created by sampling with replacement,” suggesting that the methods do not take into account that the population is finite. The function bootweights is deemed to implement the method of Canty and Davison (1999). But to simplify the discussion, that paper assumes that N/n is an integer and presents what seems like the algorithm of Gross (1980) and the case of a non-integer N/n is not discussed. Since the paper refers to Section 3.7 of Davison and Hinkley (1997), it suggests that bootweights implements the method of Booth et al. (1994). The function subbootweights seems to implement the Rao et al. (1992) method although the reference is incorrect and no finite population correction is included, i.e., it is as if f = 0 in the weights adjustment formula of Table 5 and so is not appropriate if the sampling fraction is large. On the other hand, it is clear that the function mrbweights is for the multistage method of Preston (2009). The documentation clearly mentions that “these bootstraps are strictly appropriate only when the first stage of sampling is a simple or stratified random sample of PSUs with or without replacement, and not (eg) for PPS sampling”. In fact, Preston’s method requires that simple random sampling be used at all stages, not only the first one.
Since a finite population correction is available for the bootstrap and for Preston's multistage rescaled bootstrap but not for Rao & Wu's n-1 bootstrap, as.svrepdesign uses the fpc for bootstrap and mrb weights but not for subbootstrap weights.