Add a dummy column flagging whether each row was randomly selected

Suppose I have the following data set (named data).

id var1 var2
1   A   33
2   B   23
3   A   45
4   A   55
5   B   22
6   A   33
7   B   90
8   A   78
9   B   12
10  A   11

I want to add a new column to the original data set that indicates whether each row was randomly selected (1) or not (0). I tried the following.

library(sampling)
data1 <- strata(data,"var1", size=c(4,3),method="srswor") #stratified random sampling
data2 <- getdata(data,data1)  # this gives a separate data set

Any help, please? Thanks!

Solution

If you look in the documentation of sampling::strata() you'll find the following information:

The function produces an object, which contains the following information:

ID_unit: the identifier of the selected units.
Stratum: the unit stratum.
Prob: the unit inclusion probability.

ID_unit holds the row numbers of the selected units, so it can be used to index the original data and set the flag you asked for:

data<-structure(list(id=c(1,2,3,4,5,6,7,8,9,10),var1=c("A",
"B","A","A","B","A","B","A","B","A"),var2=c(33,23,
45,55,22,33,90,78,12,11)),row.names=c(NA,-10L),class=c("tbl_df",
"tbl","data.frame"))


library(sampling)
data1 <- strata(data,"var1", size=c(4,3),method="srswor") #stratified random sampling
data2 <- getdata(data,data1)  # this gives a separate data set

data$sampled <- FALSE
data[data1$ID_unit, "sampled"] <- TRUE                 
data
#>    id var1 var2 sampled
#> 1   1    A   33   FALSE
#> 2   2    B   23    TRUE
#> 3   3    A   45   FALSE
#> 4   4    A   55    TRUE
#> 5   5    B   22   FALSE
#> 6   6    A   33    TRUE
#> 7   7    B   90    TRUE
#> 8   8    A   78    TRUE
#> 9   9    B   12    TRUE
#> 10 10    A   11    TRUE

Created on 2020-07-28 by the reprex package (v0.3.0)
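
Since the question asks for a 1/0 column rather than TRUE/FALSE, here is a minimal base R sketch of the same idea. It does not call the sampling package: `selected_rows` is a hypothetical stand-in for `data1$ID_unit`, filled with the row numbers from the run shown above, and the data is rebuilt as a plain data.frame for convenience.

```r
# Rebuild the example data as a plain data.frame.
data <- data.frame(
  id   = 1:10,
  var1 = c("A", "B", "A", "A", "B", "A", "B", "A", "B", "A"),
  var2 = c(33, 23, 45, 55, 22, 33, 90, 78, 12, 11)
)

# Assumption: stand-in for data1$ID_unit, copied from the sample run above.
selected_rows <- c(2, 4, 6, 7, 8, 9, 10)

# 1 if the row number is among the selected units, 0 otherwise.
data$sampled <- as.integer(seq_len(nrow(data)) %in% selected_rows)

# Cross-tabulate stratum against the flag to check the per-stratum sample sizes.
counts <- table(data$var1, data$sampled)
counts
```

With these row numbers, the cross-tabulation should show 4 selected rows in stratum A and 3 in stratum B, which matches size=c(4,3). In a real run you would replace `selected_rows` with `data1$ID_unit`.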