I have the following data frame:
> LikelyRenew_ReasonB %>%
+ mutate(cum_Sum = ave(freq,Name,FUN = cumsum))
Name x freq cum_Sum
1 costC 1 10 10
2 costC 2 11 21
3 costC 3 17 38
4 costC 4 149 187
5 productsC 1 31 31
6 productsC 2 40 71
7 productsC 3 30 101
8 productsC 4 86 187
9 communC 1 51 51
10 communC 2 50 101
11 communC 3 34 135
12 communC 4 52 187
13 reimburseC 1 42 42
14 reimburseC 2 26 68
15 reimburseC 3 25 93
16 reimburseC 4 94 187
17 policyC 1 31 31
18 policyC 2 25 56
19 policyC 3 28 84
20 policyC 4 103 187
21 discountC 1 2 2
22 discountC 2 2 4
23 discountC 3 3 7
24 discountC 4 180 187
Here is what the variable looks like:
> dput(head(LikelyRenew_ReasonB))
structure(list(Name = c("costC", "costC", "costC", "costC", "productsC",
"productsC"), x = c(1, 2, 3, 4, 1, 2), freq = c(10L, 11L, 17L,
149L, 31L, 40L)), .Names = c("Name", "x", "freq"), row.names = c(NA,
6L), class = "data.frame")
I am trying to get for each group, the relative frequency for each frequency score and then the sum of the relative frequencies for the group. I put below a sample of what I am looking for - the 1st three rows are their freq/cum_Sum[x==4]. The last row should be a total of those 3 rows.
Is this possible?? I am totally stumped.
Name x freq cum_Sum IdealOutput *how i calculated IdealOutput
1 costC 1 10 10 5.35 (10/187)
2 costC 2 11 21 5.88 (11/187)
3 costC 3 17 38 9.09 (17/187)
4 costC 4 149 187 20.32 (sum of above 3 values)
You can try using dplyr::lag
on cum_Sum
to calculate IdealOutput
for last row of a group.
The last row of a group can be found using condition as row_number() == n()
library(dplyr)
LikelyRenew_ReasonB %>% group_by(Name) %>%
arrange(Name, x) %>%
mutate(cum_Sum = cumsum(freq)) %>%
mutate(IdealOutput = ifelse(row_number() == n(),
lag(cum_Sum)/sum(freq), freq/sum(freq)))
# # A tibble: 6 x 5
# # Groups: Name [2]
# Name x freq cum_Sum IdealOutput
# <chr> <dbl> <int> <int> <dbl>
# 1 costC 1.00 10 10 0.0535
# 2 costC 2.00 11 21 0.0588
# 3 costC 3.00 17 38 0.0909
# 4 costC 4.00 149 187 0.203
# 5 productsC 1.00 31 31 0.437
# 6 productsC 2.00 40 71 0.437
Data:
LikelyRenew_ReasonB <- structure(list(Name = c("costC", "costC", "costC", "costC", "productsC",
"productsC"), x = c(1, 2, 3, 4, 1, 2), freq = c(10L, 11L, 17L,
149L, 31L, 40L)), .Names = c("Name", "x", "freq"), row.names = c(NA,
6L), class = "data.frame")