Here is my dataframe. It's not very long - only six rows.
df <- structure(list(Send_Month = c("2021-05", "2021-06", "2021-07",
"2021-05", "2021-06", "2021-07"), Order_Result = c("No", "No",
"No", "Yes", "Yes", "Yes"), Email_Send = c(135, 495, 475, 7,
28, 25), Unique_Email_Opens = c(45, 149, 143, 7, 28, 25), Unique_Email_Clicks = c(6,
21, 10, 7, 28, 25), Total_Orders = c(37, 106, 46, 7, 28, 25)), row.names = c(NA,
-6L), groups = structure(list(Send_Month = c("2021-05", "2021-06",
"2021-07"), .rows = structure(list(c(1L, 4L), c(2L, 5L), c(3L,
6L)), ptype = integer(0), class = c("vctrs_list_of", "vctrs_vctr",
"list"))), row.names = c(NA, -3L), class = c("tbl_df", "tbl",
"data.frame"), .drop = TRUE), class = c("grouped_df", "tbl_df",
"tbl", "data.frame"))
I'm having trouble picturing how I'm going to get summary results that I can graph into a bar chart. I'm trying to make some groupings here:
When the month is the same (for example, "2021-05") and I look at the Email_Send variable, I can see that 7 out of 142 (which is 135+7) emails that were sent led to an order. I can also see that 7 out of 52 (which is 45+7) emails that were opened led to an order. And 7 out of 13 (which is 6+7) emails that were clicked on led to an order. That is for the "2021-05" group.
How can I create these statistics for each grouping so I can see how the percentage would change for each group, where the denominator keeps shifting?
I tried using the janitor package for a second and just to orient myself I first filtered to only include that 2021-05 group:
df_may <- df %>%
filter(Send_Month == "2021-05")
df_may %>%
adorn_totals("row")
But I don't know if this method is very flexible for looking at all the groups together and also I don't know if I really want a summary row or a new column. So I don't know if I'm heading in the right direction here.
Thank you dear @ThomasIsCoding for an excellent tip of using proporstions
function in place of .x/sum(.x)
.
library(dplyr)
library(purrr)
df %>%
group_by(Send_Month, .add = TRUE) %>%
group_split() %>%
map(~ .x %>%
mutate(across(!c(1, 2), ~ proportions(.x))))
[[1]]
# A tibble: 2 x 6
Send_Month Order_Result Email_Send Unique_Email_Opens Unique_Email_Clicks Total_Orders
<chr> <chr> <dbl> <dbl> <dbl> <dbl>
1 2021-05 No 0.951 0.865 0.462 0.841
2 2021-05 Yes 0.0493 0.135 0.538 0.159
[[2]]
# A tibble: 2 x 6
Send_Month Order_Result Email_Send Unique_Email_Opens Unique_Email_Clicks Total_Orders
<chr> <chr> <dbl> <dbl> <dbl> <dbl>
1 2021-06 No 0.946 0.842 0.429 0.791
2 2021-06 Yes 0.0535 0.158 0.571 0.209
[[3]]
# A tibble: 2 x 6
Send_Month Order_Result Email_Send Unique_Email_Opens Unique_Email_Clicks Total_Orders
<chr> <chr> <dbl> <dbl> <dbl> <dbl>
1 2021-07 No 0.95 0.851 0.286 0.648
2 2021-07 Yes 0.05 0.149 0.714 0.352