I want to summarise several columns from a data.frame. The grouping and summary were achieved with dplyr, as in the example below.
library(dplyr)
df = data.frame(time = rep(c("day", "night"), 10),
                who = rep(c("Paul", "Simon"), each = 10),
                var1 = runif(20, 5, 15), var2 = runif(20, 10, 12),
                var3 = runif(20, 2, 7), var4 = runif(20, 1, 3))
Writing the function I need:
quantil_x = function(var, num) {
  quantile(var, num, na.rm = TRUE)
}
Using it on var1 and exporting:
percentiles = df %>% group_by(time, who) %>% summarise(
  P0 = quantil_x(var1, 0),
  P25 = quantil_x(var1, .25),
  P75 = quantil_x(var1, .75)
)
write.table(percentiles, file = "summary_var1.csv", row.names = FALSE, dec = ",", sep = ";")
What I want is to repeat this same task for 'var2', 'var3' and 'var4'. I tried to run a loop to perform this task multiple times, with no success. Unfortunately, I couldn't find a way to handle distinct variable names within the code. That is, within the loop I tried using summarise_(), get() inside the function quantil_x() or within summarise, and also as.name, but none of these worked.
I'm pretty sure this is a bad coding skill issue, but that's all I came up with so far. Here is an example of what I tried to do:
list = c("var1", "var2", "var3", "var4")
for (i in list) {
  percentiles = df %>% group_by(time, who) %>% summarise(
    P0 = quantil_x(get(i), 0),
    P25 = quantil_x(get(i), .25),
    P75 = quantil_x(get(i), .75)
  )
  write.table(percentiles, file = paste0("summary_", i, ".csv"), row.names = FALSE, dec = ",", sep = ";")
}
I read this post, but it didn't help much in my case.
Thanks in advance.
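You can do this with gather() from tidyr, which reshapes the data to long format so that a single grouped summarise covers every variable: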
library(tidyr)
percentiles = df %>%
  # stack var1..var4 into a Var (name) / Value (number) pair of columns
  gather(Var, Value, var1, var2, var3, var4) %>%
  group_by(Var, time, who) %>%
  summarise(
    P0 = quantil_x(Value, 0),
    P25 = quantil_x(Value, .25),
    P75 = quantil_x(Value, .75)
  )
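If you still need one CSV per variable, as in the question, the long summary can be split by Var afterwards. The following is only a minimal sketch under a few assumptions: dplyr and tidyr are both at version 1.0 or later, pivot_longer() is swapped in for gather() (which tidyr now marks as superseded), set.seed() is added purely for reproducibility, and out_dir is a hypothetical output folder you would replace with your own.

```r
library(dplyr)
library(tidyr)

set.seed(1)  # assumption: only here so the sketch is reproducible
df <- data.frame(
  time = rep(c("day", "night"), 10),
  who  = rep(c("Paul", "Simon"), each = 10),
  var1 = runif(20, 5, 15), var2 = runif(20, 10, 12),
  var3 = runif(20, 2, 7),  var4 = runif(20, 1, 3)
)

quantil_x <- function(var, num) {
  # unname() drops the "25%"-style name that quantile() attaches
  unname(quantile(var, num, na.rm = TRUE))
}

# Long format: one row per (time, who, variable, value),
# then one summary row per (variable, time, who)
percentiles <- df %>%
  pivot_longer(var1:var4, names_to = "Var", values_to = "Value") %>%
  group_by(Var, time, who) %>%
  summarise(
    P0  = quantil_x(Value, 0),
    P25 = quantil_x(Value, 0.25),
    P75 = quantil_x(Value, 0.75),
    .groups = "drop"
  )

# Write one CSV per variable; out_dir is a hypothetical location
out_dir <- tempdir()
for (v in unique(percentiles$Var)) {
  one_var <- percentiles %>% filter(Var == v) %>% select(-Var)
  write.table(one_var,
              file = file.path(out_dir, paste0("summary_", v, ".csv")),
              row.names = FALSE, dec = ",", sep = ";")
}
```

The loop here only iterates over values already stored in the Var column, so no column has to be looked up by name inside summarise().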