So I have a csv which I have manipulated to give me this table called Completion
dput(Completion)
structure(list(value = structure(1:16, .Label = c("0%", "100%", "13%", "15%","16%", "24%", "26%", "28%", "33%", "40%", "50%", "53%", "66%", "73%", "75%", "93%"), class = "factor"), All = c(13L, 0L, 3L, 0L, 0L, 1L, 1L, 0L, 1L, 1L, 0L, 2L, 0L, 1L, 0L, 3L),
M0 = c(14L, 10L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 0L,
0L, 0L, 0L, 0L), M1 = c(17L, 6L, 0L, 0L, 0L, 0L, 0L, 1L,
0L, 0L, 1L, 0L, 1L, 0L, 0L, 0L), M2 = c(21L, 4L, 0L, 0L,
1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), M3 = c(21L,
2L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 2L, 0L)),
.Names = c("value", "All", "M0", "M1", "M2", "M3"), row.names = c(NA, 16L), class =
c("cast_df", "data.frame"), idvars = "value", rdimnames = list(structure(list(value =
structure(1:16, .Label = c("0%", "100%", "13%", "15%", "16%", "24%", "26%", "28%", "33%",
"40%", "50%", "53%", "66%", "73%", "75%", "93%"), class = "factor")),
.Names = "value", row.names = c("0%", "100%", "13%", "15%", "16%", "24%", "26%", "28%",
"33%", "40%", "50%", "53%", "66%", "73%", "75%", "93%"), class = "data.frame"),
structure(list(Module = structure(1:5, .Label = c("All",
"M0", "M1", "M2", "M3"), class = "factor")), .Names = "Module", row.names = c("All",
"M0", "M1", "M2", "M3"), class = "data.frame")), .Label =
c("0-49","50-59","60-69", "70-79",
"80-89", ">90"))
This table shows the number of people who completed (value)% of a given module in a course.
What I would like to do is create categories for the values as follows; Majority(90%-100%), Substantial(75%-89%), Partial(50%-74%), Minimal(1%-49%), and Incomplete(0%) I would then like to be able to get a count of all of the instances of this by module, like this:
M0 M1 M2 M3 All
Majority 1 2 3 4 5
Substantial 5 4 3 2 1
Partial 4 3 2 1 5
Minimal 3 2 1 5 4
Incomplete 2 1 5 4 3
The count of each of the instances would fill up the table.
Is there a way to do this? I have tried making different pivot tables, and attaching levels;
comp.rate <- Completion$value
comp.rate <- ordered(comp.rate, levels = c("Majority", "Substantial", "Partial", "Minimal",
"Incomplete"))
I think I don't know how to assign the values I want to these levels and then, in turn, get R to spit out their occurrences. I have also had problems because the values are in percentage form... and not decimal. I haven't attached my full code, but I can if it is of any use... Any help is greatly appreciated. Thanks!
You can do something like this:
First you have to transform the percentage column to a numeric one using gsub
d$value <- as.numeric(gsub("[%]", "", d$value))
then you can count all occurences using a condition and colSums
:
#Majority(90%-100%)
colSums(d[ d$value >= 90, ])
value All M0 M1 M2 M3
193 3 10 6 4 2
# or Substantial(75%-89%)
colSums(d[ d$value >= 75 & d$value < 89, ])
value All M0 M1 M2 M3
75 0 0 0 0 2
Saving all together in one data.frame:
a1 <- colSums(d[ d$value >= 90,])
a2 <- colSums(d[ d$value >= 75 & d$value < 89, ])
result <- data.frame(rbind(a1, a2))
result$value <- c("Majority", "Substantial")
result
value All M0 M1 M2 M3
a1 Majority 3 10 6 4 2
a2 Substantial 0 0 0 0 2