
Can you iterate an operation over every row of a dataframe using apply?


I frequently find myself using a for loop to perform row-wise operations involving multiple dataframes, along the lines of the following example:

# Sample data
set.seed(123)
df1 <- data.frame(a = sample(1:100, size = 20), b = sample(letters, size = 20))
df2 <- data.frame(c = sample(1:100, size = 20), d = sample(letters, size = 20))

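# Sample loop operation: for each row of df2, look up df2$c in df1$a
df2$x <- NA_character_  # create the result column up front
for (i in seq_len(nrow(df2))) {
  number.2 <- df2$c[i]
  letter.1 <- df1$b[df1$a == number.2]

  # length() check stands in for is_empty(), which needs purrr or rlang loaded
  df2$x[i] <- if (length(letter.1) > 0) paste0(letter.1) else "No match"
}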

The code does what I want, but I suspect there's a more elegant way to go about it using apply.

Is there a way to do this using the apply family of functions?


Solution

  • What you've got here is a long-winded way to code a common operation: a lookup. It is a one-liner using match(), or you can generalize it to more cases and columns using merge or one of dplyr's join functions.

    match works well when you are matching on a single value. merge or a join works well when you are matching on multiple columns. If you have a more complicated condition, like >= instead of ==, then you need a "non-equi join", which both dplyr and data.table support. Note that match returns NA where there is no match, so unmatched rows come back as NA rather than "No match".

    df2$result = df1[match(df2$c, df1$a), "b"]
    df2
    #     c d        x result
    # 1  89 v No match   <NA>
    # 2  34 z No match   <NA>
    # 3  93 g No match   <NA>
    # 4  69 p        p      p
    # 5  72 q        x      x
    # 6  76 r No match   <NA>
    # 7  63 y No match   <NA>
    # 8  13 b No match   <NA>
    # 9  82 d No match   <NA>
    # 10 91 m No match   <NA>
    # 11 25 e        w      w
    # 12 38 f No match   <NA>
    # 13 21 c No match   <NA>
    # 14 79 i        q      q
    # 15 41 u No match   <NA>
    # 16 47 o No match   <NA>
    # 17 60 t No match   <NA>
    # 18 16 j No match   <NA>
    # 19  6 x No match   <NA>
    # 20 96 n No match   <NA>
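
For completeness, here is a rough sketch of the other routes mentioned above, using the question's sample data. It assumes df1$a has no duplicate values (true here, since sample() draws without replacement). The names idx and joined are only illustrative and do not come from the original post. The vapply version is the closest answer to the literal question about the apply family, but the match and merge versions are usually the better choice.

```r
# Same sample data as in the question
set.seed(123)
df1 <- data.frame(a = sample(1:100, size = 20), b = sample(letters, size = 20))
df2 <- data.frame(c = sample(1:100, size = 20), d = sample(letters, size = 20))

# apply-family version of the original loop: one lookup per value of df2$c
df2$x <- vapply(df2$c, function(n) {
  hit <- as.character(df1$b[df1$a == n])
  if (length(hit) > 0) hit[1] else "No match"
}, character(1))

# match() lookup, then swap the NA results for the question's "No match" label
idx <- match(df2$c, df1$a)
df2$result <- ifelse(is.na(idx), "No match", as.character(df1$b[idx]))

# Left join with base merge(); all.x = TRUE keeps the unmatched rows of df2.
# merge() sorts by the key column, so the row order can differ from df2.
joined <- merge(df2, df1, by.x = "c", by.y = "a", all.x = TRUE)
```

In joined, the looked-up letter is in column b and is NA for rows of df2 that have no partner in df1.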