Splitting a numeric string column into a series of indicator columns

I am working on a data analysis project where I was given a very messy data set. To contextualize, the data is for mosquito surveillance. One of my columns is a series of numbers seperated by " ". Each of the numbers in these strings represent a different type of container that was tested for the presence of absence of mosquitos. For example, these are 4 examples of entries in this column, "1 3 4 5", "1 2 5 888", "1 888", and "2 3 888". There are 6 different numbers used throughout this column (1, 2, 3, 4, 5, and 888). I do not want to enlogate my dataset any further so I am hoping to create 6 different binary indicator columns to mark the presence or absence of each of these container types for each entry. I am fairly new to R so any suggestions or tips that you might have would be greatly appreciated!

For reference, this is the closest that I have gotten to what I'm looking for:

HHAnalysis$container_sondeo_sites <- str_split(HHAnalysis$container_sondeo_sites, " ", 6, TRUE)

However, because the numbers do not follow the same order for each entry, I have numbers in the wrong column. For example, I have 5's in my 2 column, 888's in my 1 column, etc. I appologize if my explaination is confusing. This is my frist entry and I am trying to figure out how to best convey my problem. Thanks in advance!

Solution

Not exactly sure what you need, but this might get you going:

library(tidyr)
df %>%
  # split `string` into separate values:
  separate_rows(string) %>%
  # from these values create new columns each:
    pivot_wider(names_from = string, values_from = string,
                # create binary indicators:
                values_fn = function(x) 1, values_fill = 0)
# A tibble: 4 × 7
  someVar   `1`   `3`   `4`   `5`   `2` `888`
  <chr>   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 A           1     1     1     1     0     0
2 B           1     0     0     1     1     1
3 C           1     0     0     0     0     1
4 D           0     1     0     0     1     1

Data:

df <- data.frame(
  someVar = LETTERS[1:4],
  string = c("1 3 4 5", "1 2 5 888", "1 888", "2 3 888")
)