
How to retrieve all texts in which a word appears


I have a data frame consisting of words that appear in different texts.

     word text
1     a    1
2     a    2
3     a    5
4     b    1
5     b    3
6     c    1
7     c    3
8     c    4
9     d    4
10    e    2
11    e    4
12    f    3
13    g    2
14    h    5
15    i    5

I want to have an output for each word like this:

a
[1] 1 2 5
b
[1] 1 3 
...

Is there a way to retrieve the texts for every word at once, without having to type "a" or "b" each time I want the texts for a particular word? Many thanks!


Solution

  • tidyverse

    df <- data.frame(
      stringsAsFactors = FALSE,
                  word = c("a","a","a","b","b","c",
                           "c","c","d","e","e","f","g","h","i"),
                  text = c(1L,2L,5L,1L,3L,1L,3L,4L,
                           4L,2L,4L,3L,2L,5L,5L)
    )
    
    library(tidyverse)
    
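    df %>%
      nest_by(word) %>%   # one row per word; its text values go into a list-column `data`
      deframe() %>%       # two-column tibble -> list named by word
      map(pull)           # pull the text vector out of each nested tibble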
    #> $a
    #> [1] 1 2 5
    #> 
    #> $b
    #> [1] 1 3
    #> 
    #> $c
    #> [1] 1 3 4
    #> 
    #> $d
    #> [1] 4
    #> 
    #> $e
    #> [1] 2 4
    #> 
    #> $f
    #> [1] 3
    #> 
    #> $g
    #> [1] 2
    #> 
    #> $h
    #> [1] 5
    #> 
    #> $i
    #> [1] 5
    

    Created on 2023-11-29 with reprex v2.0.2
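
  • base R (an alternative sketch, not part of the original answer)

If loading the tidyverse is not desired, base R's `split()` should give the same kind of named list. The snippet below rebuilds the data frame from the question; the name `texts_by_word` is only chosen for illustration.

```r
# Same data as in the question
df <- data.frame(
  word = c("a", "a", "a", "b", "b", "c", "c", "c",
           "d", "e", "e", "f", "g", "h", "i"),
  text = c(1L, 2L, 5L, 1L, 3L, 1L, 3L, 4L,
           4L, 2L, 4L, 3L, 2L, 5L, 5L),
  stringsAsFactors = FALSE
)

# split() groups the values of `text` by the values of `word`
# and returns a named list with one integer vector per word
texts_by_word <- split(df$text, df$word)

# Printing the whole list shows every word at once;
# a single word can still be looked up by name
texts_by_word[["a"]]
#> [1] 1 2 5
texts_by_word$b
#> [1] 1 3
```

Because the result is an ordinary named list, it can be passed to `lapply()` or `names()` in the usual way when the texts of all words need to be processed together.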