Search code examples
rexpss

Pivoting the table with multiple variables in a row


I am using the expss package in R to create tables. My apologies as I am having difficulty understanding the documentation.

Here is my sample data:

dput(df)
structure(list(cohort_tracing_complete = structure(c(0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 
0, 0, 0, 0, 2, 2, 0, 2, 2, 2, 2, 0, 2, 2, 2, 2), label = "Form Complete", class = c("labelled", 
"numeric"), labels = c(Incomplete = 0, Unverified = 1, Complete = 2
)), crf_1_eligibility_consent_recruitment_complete = structure(c(2, 
2, 2, 2, 2, 2, 2, 0, 0, 2, 0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), label = "Form Complete", class = c("labelled", 
"numeric"), labels = c(Incomplete = 0, Unverified = 1, Complete = 2
))), row.names = c(NA, -40L), class = c("tbl_df", "tbl", "data.frame"
))

df %>% 
  tab_cells(
    cohort_tracing_complete,
    crf_1_eligibility_consent_recruitment_complete) %>%
  tab_cols(total()) %>%
  tab_stat_cases(
    total_label = NULL,
    total_statistic = "u_cases",
    total_row_position = "below",
  ) %>%
  tab_pivot() %>%
  tab_transpose

Im trying to get my data to look like this

Desired Format enter image description here

I tried to transform my table using the code above and now it look like this

enter image description here

All numbers are in a row.

How can I tweak my code so that I will be able to get the desired format?


Solution

  • I know this is achieved using dplyr and tidyr rather than expss.

    I had a quick look at expss all the examples seem to be about subsetting summary data between at least two variables rather than summary data for two variables.

    library(tidyr)
    library(dplyr)
    
      df %>% 
      pivot_longer(everything()) %>% 
      group_by(name, value) %>% 
      summarise(count = n()) %>%
      pivot_wider(names_from = value, values_from = count)%>% 
      rowwise() %>% 
      mutate(Total = sum(c(Incomplete, Unverified, Complete), na.rm = TRUE))
    
    # A tibble: 2 x 5
    
      name                                           Incomplete Unverified Complete Total
      <chr>                                               <int>      <int>    <int> <int>
    1 cohort_tracing_complete                                28          1       11    40
    2 crf_1_eligibility_consent_recruitment_complete         25         NA       15    40