Search code examples
rdataframetidyversespreaddcast

R - rotating a difficult data frame


Suppose I've got sales data for three sales reps selling a variety of products. What makes it difficult is that each rep sells a different combination of products, and not necessarily the same number of them, either:

Bob sells Products A, B, and C

Mike sells Products A, B, C, and D

Sara sells Products A, B, and E

   RepName Product SalesDollarAmt SalesQty
 1     Bob       A             43        3
 2    Mike       A             14        5
 3    Sara       A             53        1
 4     Bob       B            100       35
 5    Mike       B            215       80
 6    Sara       B            310      105
 7     Bob       C              5        8
 8    Mike       C             10        3
 9    Mike       D            105       50
10    Sara       E             25       18

I want to rotate this on the Product, so that the result looks like this:

  RepName Product.1 SalesDollarAmt.1 SalesQty.1 Product.2 SalesDollarAmt.2 SalesQty.2 Product.3 SalesDollarAmt.3 SalesQty.3 Product.4 SalesDollarAmt.4 SalesQty.4
1     Bob         A               43          3         B              100         35         C                5          8      <NA>                0          0
2    Mike         A               14          5         B              215         80         C               10          3         D              105         50
3    Sara         A               53          1         B              310        105         E               25         18      <NA>                0          0

If they all had the same products, I'd have filtered them by Product into separate dataframes and then joined them back together on RepName. I've tried everything I can think of with spread and dcast. Thanks for any help!

Code for sample data frames:

library(tidyverse)

# initial sales data
df <- tribble(
  ~RepName, ~Product, ~SalesDollarAmt, ~SalesQty,
               #-------------------------------
               "Bob", "A", 43, 3,
               "Mike", "A", 14, 5,
               "Sara", "A", 53, 1,
               "Bob", "B", 100, 35,
               "Mike", "B", 215, 80,
               "Sara", "B", 310, 105,
               "Bob", "C", 5, 8,
               "Mike", "C", 10, 3,
               "Mike", "D", 105, 50,
               "Sara", "E", 25, 18
                )

# ideally rotated data
df2 <- tribble(
  ~RepName, ~Product.1, ~SalesDollarAmt.1, ~SalesQty.1, ~Product.2, ~SalesDollarAmt.2, ~SalesQty.2, ~Product.3, ~SalesDollarAmt.3, ~SalesQty.3, ~Product.4, ~SalesDollarAmt.4, ~SalesQty.4,
  #--------------------------------------------------------------
  "Bob", "A", 43, 3, "B", 100, 35, "C", 5, 8, NA, 0, 0, 
  "Mike", "A", 14, 5, "B", 215, 80, "C", 10, 3, "D", 105, 50,
  "Sara", "A", 53, 1, "B", 310, 105, "E", 25, 18, NA, 0, 0 
)

Solution

  • Using a combination of row_number, gather, spread, and unite, we can reshape the data. It's up to you to reorder the columns if you so choose. In the last line, we specify convert = TRUE in the call to spread. This is due to the fact that when we convert the data to long format (using gather), the column values are converted to character. Specifying convert = TRUE in the call to spread (should) get the values back to a useful form.

    df %>%
      group_by(RepName) %>%
      mutate(product_count = row_number()) %>% # product "id" within RepName
      gather(variable, value, -RepName, -product_count) %>% # reshape to long
      unite(var_prod, variable, product_count) %>%
      spread(var_prod, value, convert = TRUE) # reshape to wide
    
      RepName Product_1 Product_2 Product_3 Product_4 SalesDollarAmt_1 SalesDollarAmt_2 SalesDollarAmt_3 SalesDollarAmt_4 SalesQty_1 SalesQty_2 SalesQty_3 SalesQty_4
    1     Bob         A         B         C      <NA>               43              100                5             <NA>          3         35          8       <NA>
    2    Mike         A         B         C         D               14              215               10              105          5         80          3         50
    3    Sara         A         B         E      <NA>               53              310               25             <NA>          1        105         18       <NA>