
Is there a way to vectorize this operation using an *apply function in R?


I have a vector

a <- c("there and", "walk and", "and see", "go there", "was i", "and see", 
"i walk", "to go", "to was")

and a data frame bg where

bg <- data.frame(term=c("there and", "walk and", "and see", "go there", "was i", "and see",
"i walk", "to go", "to was"), freq=c(1,1,2,1,1,2,1,1,1))

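I need a vectorized version of the following code, using sapply, tapply, vapply, apply, or similar:

 library(dplyr)  # filter() below is dplyr::filter

 d <- NULL
 for (i in seq_along(a)) {
     temp <- filter(bg, term == a[i])  # rows of bg matching the i-th term
     d <- rbind(d, temp)               # append them to the result
 }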

The goal is to look up the rows of bg where term == a[i] for each element of a, and to collect the matches in a data frame d.

I need a vectorized version because for loops are excruciatingly slow in R.

Here is the sample data

> bg
       term freq
1 there and    1
2  walk and    1
3   and see    2
4  go there    1
5     was i    1
6   and see    2
7    i walk    1
8     to go    1
9    to was    1

and the expected result

> d
       term freq
1 there and    1
2  walk and    1
3   and see    2
4   and see    2
5  go there    1
6     was i    1
7   and see    2
8   and see    2
9    i walk    1
10    to go    1
11   to was    1

Thanks


Solution

  • This essentially becomes a merge operation, with a little twist to make sure that the row order follows the order in a:

    out <- merge(bg, list(term=a, sortid=seq_along(a)), by="term")
    out[order(out$sortid),]
    
    #        term freq sortid
    #7  there and    1      1
    #10  walk and    1      2
    #1    and see    2      3
    #3    and see    2      3
    #5   go there    1      4
    #11     was i    1      5
    #2    and see    2      6
    #4    and see    2      6
    #6     i walk    1      7
    #8      to go    1      8
    #9     to was    1      9
    

    Or in data.table 1.9.5 (which supports the on= argument), with a nod to @akrun:

    library(data.table)
    # note: setDT() converts bg to a data.table by reference
    out <- data.table(term=a, sortid=seq_along(a))[setDT(bg), on='term']
    out[order(sortid)]
    

    Or in dplyr, where the result already follows the order of a:

    library(dplyr)
    left_join(data.frame(term=a), bg, by="term")
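
Since the question asks specifically for an apply-style version, here is a minimal base R sketch of that as well, with a check that it agrees with the merge approach above. It is an illustration rather than a benchmarked recommendation. The names d_lapply and out are chosen here for the example, and stringsAsFactors = FALSE is an assumption added so that term stays a character column on older R versions.

```r
# Sample data from the question (bg$term happens to equal a here)
a <- c("there and", "walk and", "and see", "go there", "was i", "and see",
       "i walk", "to go", "to was")
bg <- data.frame(term = a, freq = c(1, 1, 2, 1, 1, 2, 1, 1, 1),
                 stringsAsFactors = FALSE)

# Loop-free equivalent of the original for loop: one subset of bg per
# element of a, bound together once at the end instead of growing d
# with rbind() on every iteration
d_lapply <- do.call(rbind, lapply(a, function(x) bg[bg$term == x, ]))
rownames(d_lapply) <- NULL

# merge-based version, reordered to follow the order of a
out <- merge(bg, list(term = a, sortid = seq_along(a)), by = "term")
out <- out[order(out$sortid), ]

# Both approaches should agree on the term and freq columns
stopifnot(identical(d_lapply$term, out$term),
          all(d_lapply$freq == out$freq))
```

The lapply version still scans bg once per element of a, so for large inputs the join-based solutions are likely to scale better. It is mainly useful as a direct translation of the loop.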