r dplyr tibble

How to create a tibble summarising missing imaging data in R


I am working with imaging data in a format similar to this:

   name  side  modality
   <chr> <chr> <chr>   
 1 alex  right xray    
 2 alex  left  xray    
 3 brad  right xray    
 4 brad  left  xray    
 5 alex  right ct      
 6 alex  left  ct      
 7 brad  right ct      
 8 alex  right mri     
 9 brad  right mri     
10 brad  left  mri

Each person is supposed to have left and right images for every modality, so the data show that Alex is missing a left MRI, Brad is missing a left CT, and Charlie (who doesn't appear in the data at all) is missing every image. I am trying to create a summary table that shows which images are 'present' or 'absent', given a list of names (in which Charlie is included). It would look something like this:

  name    left_xray right_xray left_ct right_ct left_mri right_mri n_absent
  <chr>   <chr>     <chr>      <chr>   <chr>    <chr>    <chr>        <dbl>
1 alex    present   present    present present  absent   present          1
2 brad    present   present    absent  present  present  present          1
3 charlie absent    absent     absent  absent   absent   absent           6

I have used various dplyr verbs to get a list of patients with missing data for each modality, but I'm not really sure where to start with creating a summary table.

Dummy data:

data <- tibble(name = c('alex', 'alex', 'brad', 'brad', 'alex', 'alex', 'brad', 'alex', 'brad', 'brad'),
               side = c('right', 'left', 'right', 'left', 'right', 'left', 'right', 'right', 'right', 'left'),
               modality = c('xray', 'xray', 'xray', 'xray', 'ct', 'ct', 'ct', 'mri', 'mri', 'mri'))

names <- tibble(name = c('alex', 'brad', 'charlie'))

Thank you!


Solution

  • Code

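    library(dplyr)
    library(tidyr)
    
    # Build every name x modality x side combination that should exist
    expand_grid(
      name = c("alex", "brad", "charlie"),  # or names$name from the question
      modality = c("xray", "ct", "mri"),
      side = c("right", "left")
    ) %>%
      # Flag the combinations that actually occur in the data
      left_join(
        data %>%
          mutate(aux = "present"),
        by = c("name", "modality", "side")
      ) %>%
      # Combinations with no match in the data come back as NA, i.e. absent
      mutate(aux = replace_na(aux, "absent")) %>%
      # Combine side and modality into one label such as "right_xray"
      unite(modality_side, side, modality) %>%
      pivot_wider(names_from = modality_side, values_from = aux) %>%
      # Count the absent images for each person
      rowwise() %>%
      mutate(n_absent = sum(c_across(-name) == "absent"))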
    

    Output

    # A tibble: 3 x 8
    # Rowwise: 
      name    right_xray left_xray right_ct left_ct right_mri left_mri n_absent
      <chr>   <chr>      <chr>     <chr>    <chr>   <chr>     <chr>       <int>
    1 alex    present    present   present  present present   absent          1
    2 brad    present    present   present  absent  present   present         1
    3 charlie absent     absent    absent   absent  absent    absent          6
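
    Alternative sketch

    A possible variation on the same idea, not part of the original answer: tidyr's complete() can add the missing name/side/modality combinations in one step, and rowSums() can count the absent cells without rowwise(). The data and names tibbles are repeated from the question so the block runs on its own; the status column and the summary_tbl name are chosen here for illustration only.

```r
library(dplyr)
library(tidyr)

# Dummy data repeated from the question
data <- tibble(
  name     = c("alex", "alex", "brad", "brad", "alex", "alex", "brad", "alex", "brad", "brad"),
  side     = c("right", "left", "right", "left", "right", "left", "right", "right", "right", "left"),
  modality = c("xray", "xray", "xray", "xray", "ct", "ct", "ct", "mri", "mri", "mri")
)
names <- tibble(name = c("alex", "brad", "charlie"))

summary_tbl <- data %>%
  # Every row in the data is an image that exists
  mutate(status = "present") %>%
  # Add each expected combination that is not in the data and mark it absent
  complete(
    name = names$name,
    side = c("left", "right"),
    modality = c("xray", "ct", "mri"),
    fill = list(status = "absent")
  ) %>%
  # One column per side/modality pair, e.g. left_xray
  pivot_wider(names_from = c(side, modality), values_from = status) %>%
  # Count absent cells across all columns except name
  mutate(n_absent = rowSums(across(-name) == "absent"))

summary_tbl
```

    Because rowwise() is not used, the result is an ordinary (ungrouped) tibble, which may be more convenient if further summarising follows.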