R loop `sample()` function for each new row


library(tidyverse)
fruit %>% 
  as_tibble() %>%
  transmute(fruit = value, fruit.abr = substring(value, 1, sample(3:6, 1)))

#> # A tibble: 80 x 2
#>    fruit        fruit.abr
#>    <chr>        <chr>    
#>  1 apple        app      
#>  2 apricot      apr      
#>  3 avocado      avo      
#>  4 banana       ban      
#>  5 bell pepper  bel      
#>  6 bilberry     bil      
#>  7 blackberry   bla      
#>  8 blackcurrant bla      
#>  9 blood orange blo      
#> 10 blueberry    blu      
#> # ... with 70 more rows

I'd like each value in my abbreviated fruit column to have a random length between 3 and 6 characters, with the length chosen separately for each row.

As my code is written, sample() picks one value between 3 and 6 and uses it for every row. How do I "recycle" or "loop" the sample() call so that it selects a new value for each row (e.g. 3, 6, 4, 3, 5, etc.)?


Solution

  • sample(3:6, 1) returns a single value, which is then recycled across all rows. Instead, draw a sample whose size equals the number of rows in one call, using n(). Set replace = TRUE to sample with replacement; without it, sample() fails because there are more rows than the four values in 3:6.

    fruit %>% 
      as_tibble() %>%
      transmute(fruit = value, fruit.abr = substring(value, 1, sample(3:6, n(), replace = TRUE)))
    
    # # A tibble: 10 x 2
    #    fruit        fruit.abr
    #    <chr>        <chr>    
    #  1 apple        "app"    
    #  2 apricot      "apr"    
    #  3 avocado      "avoca"  
    #  4 banana       "banana" 
    #  5 bell pepper  "bell "  
    #  6 bilberry     "bilbe"  
    #  7 blackberry   "blac"   
    #  8 blackcurrant "blac"   
    #  9 blood orange "blo"    
    # 10 blueberry    "blu"
    

    Data

    fruit <- structure(list(value = c("apple", "apricot", "avocado", "banana", 
    "bell pepper", "bilberry", "blackberry", "blackcurrant", "blood orange", 
    "blueberry")), class = "data.frame", row.names = c(NA, -10L))
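The difference between one recycled draw and one draw per row can also be seen without dplyr. The following is a minimal base R sketch, not part of the original answer: the vector reuses the first five values from the Data block, the names one_len, lens, abbr_same and abbr_varied are made up for illustration, and the seed is arbitrary and only there to make repeated runs match.

```r
# Hypothetical stand-in for the data: first five values of the Data block.
fruit <- c("apple", "apricot", "avocado", "banana", "bell pepper")

set.seed(42)  # arbitrary seed, only for repeatable runs

# One draw: substring() recycles the single length across every element.
one_len <- sample(3:6, 1)
abbr_same <- substring(fruit, 1, one_len)

# One draw per element: size = length(fruit), and replace = TRUE is needed
# because five draws cannot come from four values without replacement.
lens <- sample(3:6, length(fruit), replace = TRUE)
abbr_varied <- substring(fruit, 1, lens)

# Side-by-side view of both results and the lengths that were drawn.
print(data.frame(fruit, abbr_same, abbr_varied, lens))
```

Inside transmute(), n() plays the role that length(fruit) plays here: it supplies the number of rows, so sample() returns one length per row and substring() pairs them up element by element.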