r statistics mean median standard-deviation

Standard deviation for certain rows


I have a long dataset with different types of questions, identified in the case column.

age <- c("18-30","31-45","60+","46-60", "31-45", "18-30", "60+", "46-60")
gender <- c("M","F","F","F","M","M","F","M")
case <- c("Q1","Q1","Q2","Q2","Q3","Q3","Q4","Q4")
height <- c(0,200,310,0,0,175,270,150)

I would like to calculate the mean, the median and the standard deviation of the height column per question, so 4 different tables for Q1, Q2, Q3 and Q4. My knowledge of R is really limited. Can anyone help me with this, please? Thanks in advance.


Solution

  • library(dplyr)
    df <- tibble(
      age = c("18-30","31-45","60+","46-60", "31-45", "18-30", "60+", "46-60"),
      gender = c("M","F","F","F","M","M","F","M"),
      case = c("Q1","Q1","Q2","Q2","Q3","Q3","Q4","Q4"),
      height = c(0,200,310,0,0,175,270,150)
    )
    
    df %>% 
      group_by(case) %>% 
      summarise(mean = mean(height), 
                median = median(height), 
                sd = sd(height))
    

    If you want a separate data frame for each case, you can simply filter for the question you want, e.g. for the first case, "Q1":

    df %>% 
      group_by(case) %>% 
      summarise(mean = mean(height), 
                median = median(height), 
                sd = sd(height)) %>%
      filter(case == "Q1")
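
If you would rather get all four tables at once instead of filtering one question at a time, one possible approach is to split the data by case and summarise each piece. The sketch below is an alternative using base R only (`split()` and `lapply()`), not the dplyr pipeline from the answer; the names `by_case` and `summaries` are made up for illustration.

```r
# Same example data as in the question (only the columns needed here)
df <- data.frame(
  case   = c("Q1", "Q1", "Q2", "Q2", "Q3", "Q3", "Q4", "Q4"),
  height = c(0, 200, 310, 0, 0, 175, 270, 150)
)

# split() returns a named list with one data frame per question
by_case <- split(df, df$case)

# Summarise each piece; the result is a named list of one-row data frames
summaries <- lapply(by_case, function(d) {
  data.frame(
    mean   = mean(d$height),
    median = median(d$height),
    sd     = sd(d$height)
  )
})

# The table for a single question can then be pulled out by name
summaries[["Q1"]]
```

Each element of `summaries` is its own small table, so `summaries[["Q2"]]`, `summaries[["Q3"]]` and `summaries[["Q4"]]` give the other three.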