Resample with replacement by cluster

I want to draw clusters (defined by the variable id) with replacement from a dataset, and in contrast to previously answered questions, I want clusters that are chosen K times to have each observation repeated K times. That is, I'm doing cluster bootstrapping.

For example, the following code draws id=1 twice, but the observations for id=1 appear only once in the new dataset s, because %in% only tests membership and ignores how many times an id was drawn. I want all observations from id=1 to appear twice.

f <- data.frame(id=c(1, 1, 2, 2, 2, 3, 3), X=rnorm(7))
set.seed(451)
new.ids <- sample(unique(f$id), replace=TRUE)
s <- f[f$id %in% new.ids, ]

Solution

One option is to lapply over each element of new.ids, extract the rows for that id, and collect the pieces in a list. You can then stack them all together with rbindlist:

library(data.table)
rbindlist(lapply(new.ids, function(x) f[f$id %in% x,]))
#  id           X
#1:  1  1.20118333
#2:  1 -0.01280538
#3:  1  1.20118333
#4:  1 -0.01280538
#5:  3 -0.07302158
#6:  3 -1.26409125
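
If you would rather not depend on data.table, the same idea can be sketched in base R. This is only a rough version under a few assumptions of mine: the draw column is a hypothetical addition (not part of the question) that tags each sampled copy of a cluster so repeated clusters can be told apart later, and set.seed is called before the data is created so that X is reproducible as well, which means the ids drawn may differ from the output above.

```r
# Toy data mirroring the question; seeding first makes X reproducible too
set.seed(451)
f <- data.frame(id = c(1, 1, 2, 2, 2, 3, 3), X = rnorm(7))

# Draw cluster ids with replacement (one draw per distinct cluster)
new.ids <- sample(unique(f$id), replace = TRUE)

# For each draw, take every row of the drawn cluster.
# Iterating over the draws (not over unique ids) is what repeats
# a cluster's rows once per time it was drawn.
pieces <- lapply(seq_along(new.ids), function(k) {
  rows <- f[f$id == new.ids[k], , drop = FALSE]
  rows$draw <- k  # hypothetical column: which draw these rows came from
  rows
})

# Stack the pieces into one data frame and reset the row names
s <- do.call(rbind, pieces)
rownames(s) <- NULL
```

Grouping by draw rather than id in later steps treats each sampled copy as its own cluster, which is usually what a cluster bootstrap needs.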
