R tidyr spread columns across all categories of a given variable

I'm working on a dataset that looks like this.

#Dataframe
df=data.frame(Type=c(1,2,4,5,4,3,3,4,5,1,2,3,2,1,2,3,3,2,1,1,NA),
          Q1=c(1,2,6,8,9,10,2,6,7,4,9,9,1,2,NA,4,3,8,7,6,4),
          Q2=c(1,2,4,NA,8,2,1,2,10,7,5,5,5,8,2,7,4,8,7,5,1))

Context

The dataframe consists of results from a questionnaire.

The first column, Type, refers to the type of employee who answered the questionnaire, where 1 = 'Worker', 2 = 'Factory Lead', 3 = 'Administrative Staff', 4 = 'Middle Management' & 5 = 'Executive'

The second and third column (Q1 & Q2) are questions, rated on a scale of 1 = 'Strongly Agree' to 10 (Strongly Disagree).

What I'm Trying to Achieve

I'd like to compute the total number of responses per Type, based on score. I have created bins for the scores, and they are -

1) Low Agreement - Scores from 0 to 4

2) Medium Agreement - Scores of 5 or 6

3) High Agreement - Scores of 7 or 8

4) Very High Agreement - Scores of 9 or 10

So I'd like to count the number of responses per worker per score bin.

My Attempt

library(dplyr)
library(tidyr)

result=df %>%
gather(Item,response,-1) %>%
filter(!is.na(response)) %>%
group_by(Type,Item) %>%
filter(!is.na(Type)) %>%
summarise(Low=sum(response %in% c(0,1,2,3,4)),
        Medium=sum(response %in% c(5,6)),
        High=sum(response %in% c(7,8)),
        VHigh=sum(response %in% c(9,10)) %>%
spread(Type,-Item)

My logic is I use tidyr library and first gather scores to compute the total responses. Then spread the columns, so that I have the sub-totals by worker and by score category.

For example, for Q1, a column for total responses for Low-Worker, then Medium-Worker, then High-Worker, then Very High-Worker, then Low-Factory Lead, then Medium-Factory Lead .... and so on for all combinations of employees and score categories.

Clearly something is not right in my code.

Desired Output

A dataframe with two rows (Q1 & Q2) and 20 columns (for each employee-score combination).

Any help on this would be highly appreciated.

Solution

Create a data frame of scores

library(tidyr)
library(dplyr)
df <- data_frame(type=c(1,2,4,5,4,3,3,4,5,1,2,3,2,1,2,3,3,2,1,1,NA),
                 q1=c(1,2,6,8,9,10,2,6,7,4,9,9,1,2,NA,4,3,8,7,6,4),
                 q2=c(1,2,4,NA,8,2,1,2,10,7,5,5,5,8,2,7,4,8,7,5,1))

scores <- data_frame(score = 0:10,
                     scorebin = c(rep("Low", 5),
                                  rep("Medium", 2),
                                  rep("High", 2),
                                  rep("Very High", 2)))

Gather the data in long format. Join the scores data frame to add a scorebin column. Group by item, type and scorebin and count the number of answers under each group.

df2 <- df %>%
    gather(item, score, -type) %>% 
    left_join(scores, by = "score") %>% 
    group_by(item, type, scorebin) %>% 
    summarise(n = n()) %>% 
    unite(employeescore, type, scorebin)

Change employeescore to a factor with ordered levels so that they are not displayed in alphabetical order (high, low, medium) but in the correct order (low, medium, high).

employeescoreorder <- scores %>% 
    distinct(scorebin) %>% 
    merge(distinct(df, type)) %>% 
    unite(employeescore, type, scorebin)
df2$employeescore <- factor(df2$employeescore, 
                            levels = employeescoreorder$employeescore)

Spread the data frame in wide format to obtain 20 columns.

df2 %>% 
    spread(employeescore, n)

# A tibble: 2 x 20
# Groups:   item [2]
   item `1_Low` `1_Medium` `1_High` `2_Low` `2_Medium` `2_High` `2_Very High` `4_Low`
* <chr>   <int>      <int>    <int>   <int>      <int>    <int>         <int>   <int>
1    q1       3          1        1       2         NA        1             1      NA
2    q2       1          1        3       2          2        1            NA       2
# ... with 11 more variables: `4_Medium` <int>, `4_High` <int>, `4_Very High` <int>,
#   `5_High` <int>, `5_Very High` <int>, `3_Low` <int>, `3_Medium` <int>, `3_High` <int>,
#   `3_Very High` <int>, NA_Low <int>, `<NA>` <int>