r statistics categorical-data median iqr

Interquartile range for categorical data


I have been asked to report descriptive statistics for my results, specifically the median and IQR of my categorical variables, but I do not know how to do that. I understand how these statistics work for continuous data, but not for categorical data.

Can anyone explain how to calculate them for categorical variables, and how to do it in R?


Solution

  • I am assuming you want to calculate the median and IQR of a numeric variable within each level of a categorical variable. In base R, you can use aggregate for this. The tidyverse (specifically dplyr) offers the group_by and summarize functions for the same task.

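    # Example data: one categorical column (gender) and two numeric columns
    df <- data.frame(
      gender = c("m", "f", "m", "x"),
      age    = c(20, 21, 64, 42),
      length = c(191, 180, 176, 177)
    )
    # IQR and median of length within each gender group
    aggregate(length ~ gender, df, IQR)
    aggregate(length ~ gender, df, median)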
    

    This produces the following output:

    aggregate(length ~ gender, df, IQR)
      gender length
    1      f    0.0
    2      m    7.5
    3      x    0.0
    
    aggregate(length ~ gender, df, median)
      gender length
    1      f  180.0
    2      m  183.5
    3      x  177.0
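
The answer mentions group_by and summarize from the tidyverse without showing them. A minimal sketch of that approach follows, assuming the dplyr package is installed. The data frame is the same example as above, and the names summary_by_gender, median_length and iqr_length are illustrative choices, not part of the original answer.

```r
library(dplyr)

# Same example data as in the base R version
df <- data.frame(
  gender = c("m", "f", "m", "x"),
  age    = c(20, 21, 64, 42),
  length = c(191, 180, 176, 177)
)

# One row per gender, with the median and IQR of length in that group
summary_by_gender <- df %>%
  group_by(gender) %>%
  summarise(
    median_length = median(length),
    iqr_length    = IQR(length)
  )

print(summary_by_gender)
```

Unlike the two separate aggregate calls, this returns both statistics in a single table, which may be more convenient for reporting.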